FOREWORD


This  report  is  one  of  a series  written  by  me  under  a 
common  title:  "Advanced  methodologies  for  human  factors 
engineering  research."  It  covers  one  segment  of  the  total 
data  collection  and  analysis  process  in  a complete  research 
program,  namely,  that  phase  dealing  with  the  screening  of  a 
very large number of variables to discover the critical
ones. According to the research strategy that I am trying
to  promote,  the  screening  process  is  not  a complete 
experiment  and  should  only  be  used  after  a thorough  analysis 
of  the  real  world  is  made  to  develop  a list  of  candidate 
variables  to  be  screened  and  before  a later  effort  to 
develop  a complete  and  accurate  response  surface.  Screening 
designs  and  response  surface  designs  are  not  two  separate 
designs,  but  the  first  is  a first  stage  of  the  second;  the 
second  is  an  outgrowth  of  the  first. 

This  report  describes  a screening  process  that  is  an 
improvement  over  that  written  in  the  earlier  reports, 
integrating  economical  multifactor  research  techniques  with 
those  that  keep  the  data  relatively  free  from  trend  effects. 
Use  of  this  report  presumes  that  the  reader  is  already 
familiar  with  the  earlier  reports,  particularly  those  on 
economical  multifactor  designs,  on  building  trend-robust 
designs,  and  on  ridge  regression  analysis,  as  well  as  the 
basic  principles  for  conducting  economical  behavioral 
research.  New  ideas  for  improved  analysis  and  for  handling 
multiple  response  data  are  introduced  here. 

The  techniques  discussed  in  this  report  are  treated 
unevenly.  Forced  by  time  limitations  to  either  go  into 
considerable  detail  regarding  a small  piece  of  the  screening 
process,  or  provide  an  overview  of  the  complete  process,  I 
chose the latter. Even for the more sparsely treated
techniques, I have tried to present enough information that
would not only direct the reader's attention to potentially
useful methodologies but, by judicious sifting and digesting,
would  also  help  clarify  the  original  papers  when  they  are 
read.  Only  one  important  step  — data  transformation  — 
was  omitted  because  I was  not  satisfied  that  the  method  I 
had  would  do  the  job  properly. 

Eventually,  the  missing  details  will  have  to  be  added, 
along  with  more  details  on  the  other  phases  of  the  research 
process  after  screening.  Although  experience  is  needed  to 
determine  the  full  power  of  this  approach,  merely  studying 
20,  30,  or  40  variables  in  a systematically-manipulated 
experiment  cannot  help  but  improve  the  predictive  quality 
of  the  research  or  the  generalizability  of  the  data  base. 

Screening designs, a class of fractional factorials,
are  systematic  data  collection  plans  that  enable  the  effects 
of  a very  large  number  of  factors  to  be  estimated  economically. 
Screening  designs  are  used  primarily  in  the  second  phase  of 
a total  research  program  where  they  are  intended  to  determine 
which  of  the  great  many  factors  have  non-trivial  effects  on 
the  performance  of  a particular  task.  Screening  designs  are 
to  be  used  to  identify  important  factors,  not  to  obtain  an 
accurate  representation  of  the  experimental  space.  This 
latter  operation  will  occur  in  subsequent  phases  of  the 
research  program. 

The  strategy  for  using  screening  designs  in  this  manner 
stems  from  the  observation  that  a great  many  psychological 
and  human  factors  experiments  investigate  trivial  factors. 

Simon (1975b), in an analysis of 239 experiments published
in Human Factors over a fourteen-year period, found that in
experiments  studying  from  one  to  five  factors,  24  percent  of 
the  494  main  effects  examined  accounted  for  one  percent  or 
less  of  the  total  variance  in  the  experiment.  Forty-one 
percent  of  main  effects  accounted  for  only  four  percent  of 
the  total  variance  in  the  experiment. 


As  might  be  expected,  the  more  factors  included  in  a 
single  experiment,  the  more  frequently  trivial  effects  were 
found.  Similar  conditions  have  been  found  in  analyses  of 
other journals that publish psychology experiments (Gallo
et al., 1977; Dunnette, 1966, p 35).

With the great many factors that are likely to affect
performance in any given task, one must wonder why any
psychologist interested in predicting and controlling
performance would study factors having trivial effects.
Why not first study the factors accounting for the large
effects? The principle of maldistribution (Budne, 1959;
Simon, 1973; 1976b) leads us to expect that a relatively few
factors account for most of the variance. These should
be investigated first in order to build a structure of data
within which marginal effects can be located and about which
confidence limits can be established.

Of course, the answer to "why?" is that until the
experiments are completed, one would not know which factors
are important. But this is where screening designs become
applicable. Instead of doing many three- or four-factor
experiments with highly replicated designs, requiring a great
many observations to collect redundant information of limited
value, the screening designs provide a means of examining a
great number of factors with the maximum amount of information,
a minimum amount of redundancy, and relatively few
observations. What the results from many little traditional
experiments cannot do, but the results from a screening
design can, is order the factors according to the size of
their effects and discover interactions among factors that
appear within the same experiment. Screening designs do all
this economically, for they can be used to study N factors with 2N
observations (although the size of the designs in this report
will always be equal to some power of 2). Thus if there are 25
factors, for example, to be ranked in terms of their
importance, only 64 observations would ordinarily be required when
screening designs are used. Furthermore, the precision
with which the main effects are estimated is usually much
greater than that of the effects measured in many smaller, yet
highly replicated studies. The effects obtained from
screening studies not only permit the ranking of factor effects on
a quantitative scale, but can provide an equation
approximating the experimental space if that space can be represented
by a linear model.

The beauty of using a screening design is that once the
important factors have been identified (step one), the same
data can be used, if supplemented by relatively few additional
observations at new experimental conditions, to complete a
response surface (step two) capable of accurately
approximating the experimental space defined by the original set of
25 factors. For several hundred observations, a reasonable
approximation of a 25-factor space is possible. These
capabilities arise through the appropriate application of the
principles of economical multifactor research (Simon, 1973),
the basic strategy being to collect only the data needed to
supply the information required at each particular phase of
the research program. Screening designs are employed in the
second phase to help the investigator decide, with as
economical an effort as possible, what factors, what measures,
and what range of values should be investigated in greater
detail at a later stage of the program.

• How to design Resolution IV screening designs robust
to linear, quadratic, and cubic trend effects
without replicating the basic design.
Complete designs are provided requiring 8, 16,
or 32 observations to quantitatively order the
effects of up to 8, 16, or 32 factors.


• How  to  prepare  to  use  screening  designs:  preliminary 
empirical  studies  and  analyses. 


• How  to  assign  operational  factors  to  the  design  to 
keep  them  robust  to  trend  while  minimizing  the 
number  of  difficult  or  time-consuming  level 
changes.


• How  to  add  center  points  to  a screening  design  to 
roughly  estimate  error  variance  and  to  provide 
the  data  needed  to  test  how  well  a linear  model 
fits  the  empirical  data. 


• How  to  include  multiple  subjects  in  the  screening 
design:  dimensionalized  as  factors. 


RESOLUTION  IV  SCREENING  DESIGN  PLANS 

This section describes how to construct a special type of
screening design and the preparations recommended for using
it. The section is written with the
assumption that the reader is familiar with the information
on fractional factorials in general and screening designs
in particular, as described by Simon (1973) in an earlier
report, or its equivalent. The reader should also be
familiar with certain techniques for constructing trend-free
2^k designs, which may be found in Simon (1974) or the
original papers. The techniques described in those two
reports are consolidated in this report to provide an
extremely economical and efficient experimental design for
identifying critical factors.

Although  the  methods  of  construction  are  described  here, 
three  complete  screening  designs  are  provided  in  this  report  in 
spite  of  a strong  personal  belief  by  the  author  that  "cookbook" 
applications  of  experimental  plans  are  to  be  deplored  and  are 
bound  to  degrade  the  quality  of  research  in  the  long  run. 
Cookbook  applications  enable  the  uninformed  to  mimic  the 
efforts  of  qualified  investigators  enough,  in  many  cases,  to 
provide  a face  validity  to  their  efforts  while  masking  sloppy 
data collection, an inadequate analysis, and a
misinterpretation of results. They allow the lazy investigator to fit his
problems  to  his  methods  and  his  experiments  to  the  designs 
that  are  available  in  a book,  rather  than  to  design  each 
experiment  in  a way  that  is  likely  to  provide  the  most  valid 
information  needed  for  the  problem  at  hand. 

The  justification  for  providing  these  ready-made  designs, 
therefore,  lies  mainly  in  their  utility  in  illustrating  the 
design  principles  described  in  this  report  and  in  reducing 
the  amount  of  routine  calculations  an  investigator  would  have 
to perform in developing the designs on his own. Proper use
of  the  designs  still  requires  a great  deal  of  involvement  by 
the  investigator  in  order  to  fit  them  to  his  problem. 

CHARACTERISTICS  OF  THE  SCREENING  DESIGNS  IN  THIS  REPORT 


Each design exhibits the following characteristics:

1.  Multifactor.  A single  run  of  these  designs  can  be 
used  to  estimate  the  effects  of  up  to  8,  16,  or  32 
factors.  By  analogy,  still  larger  designs  can  be 
constructed.  However,  in  practice,  if  adjustments 
for  trend  effects  are  to  be  made,  one  degree  of 
freedom  for  each  order  of  trend  (i.e.,  linear, 
quadratic,  or  cubic)  must  be  set  aside,  reducing 
the  number  of  experimental  factors  that  can  be 
studied. 

2. Economical. The effects of up to K factors can be
estimated with N observations, when K equals
N/2 and N equals some power of 2 (e.g., 2^4, 2^5, 2^6).
The designs in this report require 16, 32, and 64
experimental conditions in a single run for studying
up to 8, 16, and 32 factors, respectively.

3. Quasi-saturated. The designs allow for no
independent estimate of the error term unless one
wishes to assume that two-factor interaction
strings are negligible. If fewer than the maximum
possible number of factors are studied, the effects
of three-factor interaction strings can be
estimated. Without additional information, it would
be incautious to assume them to be equivalent to
an independent estimate of error.

4. Two-level factors. These designs sample only two
levels of a factor, although they could be adapted
to handle four levels per factor if necessary
(Cochran and Cox, 1957, p 273). However, since
these plans are to be used for screening, about the
only justification for a four-level factor would be
when there are four conditions of a qualitative
variable. The two levels would be selected near
psychophysical or practical performance limits of
the factor to measure the full effect.

5.  Resolution  IV.  All  main  effects  can  be  isolated 
from one another and from all two-factor
interactions. Each main effect will be aliased with a
different  string  of  three-factor  interactions. 
Two-factor  interactions  will  be  aliased  with  one 
another  in  isolated  strings. 

6. Trend-robust. The experimental conditions of each
design are ordered so that without replication,
estimates of many main effects will be totally
unaffected by linear, quadratic, and cubic trends
(for example, subject learning or equipment drift)
confounded with the effects of interest. All but a
few effects will be robust to trends. The designs
are arranged so that it is easy to identify the more
trend-robust columns to which factors are assigned.

7. Factor-level-change sensitive. If the levels of a
factor are difficult or time-consuming to change,
the investigator may use the change-counts provided
with each design to assign the difficult-to-change
factor to a column requiring few changes.

8. Robust to experimenter error. These 2^(k-p) designs
are remarkably robust to variations in setting the
experimental conditions of the independent
variables, even when the experimenter is unaware of the
existence of the error (Box, 1963).

9.  Modular.  Center-points  and  additional  levels  for 
each  factor  can  be  added  to  the  designs  to  provide 
the  data  needed  to  estimate  non-linear,  quadratic 
effects of a second-order response surface. New
blocks  of  experimental  conditions  can  be  added  to 
the  original  Resolution  IV  design  to  create 
Resolution  V designs  that  form  the  center  of  a 
central-composite  design. 
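
The trend-robustness property in item 6 can be checked numerically. The sketch below is only an illustration: it assumes the runs are carried out in Yates standard order (Factor A alternating fastest), and the helper names `standard_order_column` and `trend_free_order` are hypothetical, not taken from the report. It reports the highest order of polynomial trend with which a sign column has a zero inner product.

```python
def standard_order_column(label, letters="ABCD"):
    """Signs (+1/-1) of one effect column of a 2^k factorial in standard order."""
    column = []
    for run in range(2 ** len(letters)):
        sign = 1
        for letter in label:
            # Assumed run order: Factor A alternates fastest, then B, and so on.
            at_high_level = (run >> letters.index(letter)) & 1
            sign *= 1 if at_high_level else -1
        column.append(sign)
    return column


def trend_free_order(column, max_degree=3):
    """Highest trend degree d such that the column is orthogonal to t, ..., t^d."""
    n = len(column)
    order = 0
    for degree in range(1, max_degree + 1):
        # 2*t - (n - 1) is the centered time index, scaled to stay an integer.
        inner = sum(s * (2 * t - (n - 1)) ** degree for t, s in enumerate(column))
        if inner != 0:
            break
        order = degree
    return order


for label in ("A", "AB", "ABC", "ABCD"):
    print(label, trend_free_order(standard_order_column(label)))
```

Under these assumptions a column formed from m factors of the original factorial is free of trends up to order m-1, so the four-factor column resists linear, quadratic, and cubic trends while a single-factor column resists none. This is consistent with placing screening factors on the higher-order interaction columns of the original factorial.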

CONSTRUCTING RESOLUTION IV SCREENING DESIGNS

Since screening designs are merely a form of the 2^(k-p)
fractional factorial designs, they can be constructed in a
number of different ways. Several methods in addition to
the one used for the plans in this report are described in
order to provide the user with the greatest degree of
flexibility of method.

From Resolution III Designs

Simon (1973, pp 89-116) explains the techniques
developed by Box and Hunter (1961) and Daniel (1962) for
constructing Resolution IV screening designs from two Resolution
III designs. A Resolution III design is constructed by first
writing down the sign matrix for the full factorial and then
aliasing additional factors with the interactions of the
original design. For example, a seven-factor Resolution III
design with eight observations would be constructed by
aliasing new factors with the interactions of a 2^3 factorial
plan, thus:

Column headings

Original 2^3 factorial:           (I)  A  B  C  AB  AC  BC  ABC
2^(7-4) III created by aliasing:  (I)  A  B  C  D   E   F   G

With this design, N-1 main effects can be isolated from one
another but not from two-factor or higher interactions. The
defining generators are:

(I) = H = ABD = ACE = BCF = ABCG
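
As a concrete check of the construction just described, the following sketch builds the eight runs of the 2^(7-4) design from a 2^3 factorial and confirms the defining generators. The function names are illustrative only, and representing Factor H as a column held at +1 throughout this block is an assumption based on the "(I) = H" association in the text.

```python
from itertools import combinations, product


def resolution_iii_block():
    """Eight runs of the 2^(7-4) design: D=AB, E=AC, F=BC, G=ABC, H held at +1."""
    runs = []
    for c, b, a in product((-1, 1), repeat=3):  # A alternates fastest
        runs.append({
            "A": a, "B": b, "C": c,
            "D": a * b, "E": a * c, "F": b * c, "G": a * b * c,
            "H": 1,  # assumption: H is the identity column in this block
        })
    return runs


def word_column(runs, word):
    """Elementwise product of the columns named by the letters in `word`."""
    column = []
    for run in runs:
        value = 1
        for letter in word:
            value *= run[letter]
        column.append(value)
    return column


def cross_product_sum(x, y):
    return sum(p * q for p, q in zip(x, y))


block = resolution_iii_block()

# Every generator word multiplies out to +1 in every run.
generators_hold = all(
    set(word_column(block, word)) == {1}
    for word in ("H", "ABD", "ACE", "BCF", "ABCG")
)

# The seven varied main effects are mutually orthogonal ...
main_effects_orthogonal = all(
    cross_product_sum(word_column(block, x), word_column(block, y)) == 0
    for x, y in combinations("ABCDEFG", 2)
)

# ... but a main effect can coincide with a two-factor interaction.
d_aliased_with_ab = word_column(block, "D") == word_column(block, "AB")
```

The last check illustrates why this block alone is only Resolution III: the column for D is identical to the column for the AB interaction.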

The  research  strategy  would  be  to  collect  and  analyze  the  data 
from  the  conditions  of  this  first  block  (a  Resolution  III 
design)  in  order  to  discover  if  the  design,  the  factors,  and 
the  range  of  conditions  are  adequate  and  to  make  whatever 
changes  are  needed  before  collecting  additional  data.  When 
a great  many  factors  are  being  investigated,  information  from 
this  single  block  may  be  sufficient  in  some  cases  to  drop 
some  of  the  variables  before  commencing  data  collection  on 
the  second  block. 

When the investigator is ready to collect more data, he
constructs a second design composed of experimental
conditions for a second Resolution III block that are the
"foldovers" of the first block. In the foldover design, the levels
of all conditions, including (I) = Factor H, are reversed.
The defining generators for this second block would be:

(I) = -H = -ABD = -ACE = -BCF = ABCG

The  defining  generators  for  the  combined  design  can  be 
derived  by  expanding  each  set  of  generators  into  the  full 
set  of  defining  contrasts  and  adding  the  two  sets  together. 
Box  and  Hunter  (1961,  p 338)  provide  a rule  that  simplifies 
the  process.  They  write: 

. . . when a design is formed containing 2^(q+1) runs from
a design containing 2^q runs by replicating the 2^q
design with reversed signs and associating some
further factor X with the 2^q plus ones and 2^q minus
ones, then a general rule for obtaining the generators
and defining relation of the new design from the
generators and defining relations of the old design
is as follows: 1) All generators which contain an
even number of characters in the original design are
retained as generators in the new design. 2) All
generators which contain an odd number of characters
in the original designs will be reproduced containing
the extra character X as generators in the new design.

Thus,  in  our  example,  when  the  two  Resolution  III  designs  are 
combined,  the  result  is  a Resolution  IV  design  with  the 
following  defining  generators: 

(I) = ABDH = ACEH = BCFH = ABCG.
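
The effect of adding the foldover block can be verified directly on the sign matrix. This is a minimal sketch under the same assumption as the text (Factor H is +1 throughout the first block and -1 throughout the foldover); the helper names are hypothetical.

```python
from itertools import combinations, product


def first_block():
    """Resolution III block: D=AB, E=AC, F=BC, G=ABC, with H held at +1."""
    return [
        {"A": a, "B": b, "C": c,
         "D": a * b, "E": a * c, "F": b * c, "G": a * b * c, "H": 1}
        for c, b, a in product((-1, 1), repeat=3)
    ]


def foldover(block):
    """Reverse the level of every factor, including H, in every run."""
    return [{name: -level for name, level in run.items()} for run in block]


def word_column(runs, word):
    column = []
    for run in runs:
        value = 1
        for letter in word:
            value *= run[letter]
        column.append(value)
    return column


def main_effects_clear(runs, factors="ABCDEFGH"):
    """True if every main effect is orthogonal to every two-factor interaction."""
    for main in factors:
        for x, y in combinations(factors, 2):
            if main in (x, y):
                continue
            # Cross-product sum of column `main` with interaction column x*y.
            if sum(word_column(runs, main + x + y)) != 0:
                return False
    return True


combined = first_block() + foldover(first_block())

combined_generators_hold = all(
    set(word_column(combined, word)) == {1}
    for word in ("ABDH", "ACEH", "BCFH", "ABCG")
)
```

Here `main_effects_clear(first_block())` is false while `main_effects_clear(combined)` is true, which is the practical meaning of moving from Resolution III to Resolution IV.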

The defining contrasts (or defining relations, as Box and
Hunter call them) are obtained by expanding the defining
generators, multiplying all combinations of the original
generators in pairs, triplets, and so forth. For the above
example, the complete set of defining contrasts would be:

       1      2      3      4      12     13     14
(I) = ABDH = ACEH = BCFH = ABCG = BCDE = ACDF = CDGH =

 23     24     34     123    124    134    234     1234
ABEF = BEGH = AFGH = DEFH = ADEG = BDFG = CEFG = ABCDEFGH

where the numbers above each contrast indicate the defining
generators (numbered 1 to 4) of which it is a product. Since the
resolution of the design can be determined by the number of
letters in the smallest defining contrast, it is apparent
that the two Resolution III designs, when combined, form a
Resolution IV plan.
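
The expansion of generators into defining contrasts is mechanical and can be sketched in a few lines. The code below is an illustration rather than the program supplied with the report; it relies only on the rule that any letter appearing twice in a product cancels, and the function names are hypothetical.

```python
from itertools import combinations


def multiply(*words):
    """Product of effect labels: letters that appear an even number of times cancel."""
    letters = set()
    for word in words:
        letters ^= set(word)
    return "".join(sorted(letters))


def defining_contrasts(generators):
    """All products of the generators taken one, two, three, ... at a time."""
    contrasts = []
    for size in range(1, len(generators) + 1):
        for combo in combinations(generators, size):
            contrasts.append(multiply(*combo))
    return contrasts


def resolution(generators):
    """Number of letters in the shortest defining contrast."""
    return min(len(word) for word in defining_contrasts(generators))


combined_design = ["ABDH", "ACEH", "BCFH", "ABCG"]
print(defining_contrasts(combined_design))
print(resolution(combined_design))
```

Applied to the four generators of the combined design, the shortest contrast has four letters. Applied to the first block without the identity factor, `resolution(["ABD", "ACE", "BCF", "ABCG"])` gives three.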

Plackett and Burman designs. Resolution IV designs also
can be made from the Plackett and Burman (1946; also see Simon,
1973, pp 102-104) Resolution III designs by adding an
additional "foldover" block. One advantage of using these
designs would be the extra economy achieved as the number of
factors to be studied increases. This economy derives from
the fact that the Plackett and Burman designs can be
constructed by restricting the number of experimental conditions
to some multiple of four. The Box and Hunter designs, on the
other hand, require that the number of experimental conditions
be restricted to some power of two. Thus, if one wished a
Resolution IV design for fifty factors, the Box and Hunter
designs would require two Resolution III blocks of 64 (or 128)
experimental conditions while Plackett and Burman designs
would require two blocks of 52 (or 104) experimental
conditions. Another advantage of Plackett and Burman designs for
screening purposes was noted by Tukey (1960, p 171), who found
that the degree of confounding between main and two-factor
interaction effects in the Resolution III Plackett and Burman
plans was quite low in many cases (and much less than the
fully aliased conditions in the Box and Hunter designs).
Estimating the relative strength of main effects with the
Plackett-Burman designs before continuing to the foldover block
might, therefore, be done with greater confidence. Neither
the Plackett-Burman designs nor their potential applications
will be discussed further in this report. The reader, however,
should consider using them if they fit his problem.
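
The block sizes quoted for the fifty-factor case follow from simple arithmetic, sketched below. The sketch assumes that each Resolution III block needs one column per factor plus one for the mean, and that a Plackett and Burman plan is available at the required multiple of four; the function names are hypothetical.

```python
def next_power_of_two(n):
    size = 1
    while size < n:
        size *= 2
    return size


def next_multiple_of_four(n):
    return -(-n // 4) * 4  # ceiling division, then scale back up


def resolution_iii_block_sizes(n_factors):
    """(Box-Hunter block size, Plackett-Burman block size) for one block."""
    columns_needed = n_factors + 1  # assumption: one column per factor plus the mean
    return next_power_of_two(columns_needed), next_multiple_of_four(columns_needed)


box_hunter, plackett_burman = resolution_iii_block_sizes(50)
print(box_hunter, 2 * box_hunter)            # one block, then block plus foldover
print(plackett_burman, 2 * plackett_burman)
```

For fifty factors this gives blocks of 64 and 52 conditions, or 128 and 104 once the foldover block is added, matching the figures in the text.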

Complete Resolution IV Designs

The  designs  proposed  in  this  report  do  not  provide  for  a 
progressive  data-collection  plan  in  which  a Resolution  III 
design  is  used  first  to  investigate  the  linear  effects 
(aliased  with  all  higher  order  effects)  to  be  followed  by  a 
second  block  to  isolate  main  and  two-factor  interaction 
effects.  Instead,  with  these  designs,  it  is  presumed  that 
the  isolation  of  main  and  two-factor  interaction  effects  is 
an  absolute  requirement  for  screening  purposes  and  so  all  the 
data  for  that  purpose  is  collected  at  one  time. 

Box  and  Hunter  (1961,  p 341)  note  that  a Resolution  IV 
design  can  be  constructed  directly  "by  first  writing  down  the 
sign  matrix  for  a two-level  factorial  and  then  associating 
new variables with all interaction columns having an odd
number of [letters]." Thus, a 16-observation Resolution IV
design can be derived from a 2^4 factorial plan by aliasing
four  new  factor  labels  (e.g.,  E,F,G,  and  H)  to  the  four  three- 
factor  interactions  (i.e.,  ABC,  ABD,  ACD,  and  BCD)  in  the 
original  plan.  By  the  proper  assignment  of  new  factor  labels, 
this  design  can  be  made  equivalent  to  the  design  made  from 
the  principal  fraction  plus  foldover  Resolution  III  designs 
described  in  the  previous  section. 
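
Why the odd-letter rule yields Resolution IV can be seen from the labels alone. The sketch below, which uses hypothetical helper names, lists the odd-lettered columns of a 2^4 factorial and shows that the product of any two of them always has an even number of letters, so no two-factor interaction can fall on a column that carries a main effect.

```python
from itertools import combinations


def interaction_words(letters="ABCD"):
    """All effect labels of the full factorial, from main effects upward."""
    words = []
    for size in range(1, len(letters) + 1):
        words.extend("".join(combo) for combo in combinations(letters, size))
    return words


def multiply(x, y):
    """Product of two effect labels: shared letters cancel."""
    return "".join(sorted(set(x) ^ set(y)))


# Columns with an odd number of letters: A, B, C, D, ABC, ABD, ACD, BCD.
odd_columns = [word for word in interaction_words() if len(word) % 2 == 1]

# Each pair of screening factors interacts on the product of their columns.
pair_products = [multiply(x, y) for x, y in combinations(odd_columns, 2)]

print(odd_columns)
print(sorted(set(pair_products)))
```

The eight odd-lettered columns carry the eight main effects, and the 28 two-factor interactions fall on the seven even-lettered columns.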

The reader should be aware by this time of a number of
characteristics common to all of these methods. The sign
matrix for any design formed from a factorial plan is
arranged so that row coefficients are orthogonal among
themselves, as are column coefficients among themselves.* With
rows representing the independent experimental conditions,
sources of variance can be assigned to the columns in various
combinations. A column may be labeled a main effect or an
interaction, or as with saturated designs, a string of
interactions. However, whatever label is assigned to a column,
since columns are orthogonal, we may be certain that an effect
measured in any one column will be independent of an effect
measured in any other column. Thus, we may label the columns
as we please, as long as we are careful to see that labels
for the main effects and those for their interactions are
assigned consistently with the requirements of the sign matrix.
With these principles in mind, a screening design robust to
trend can be created.

*With the plus and minus signs actually representing
plus and minus ones, orthogonality between any pair of columns
can be checked by obtaining the cross-product sum between
columns, which must equal zero. The same is true with rows.
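
The cross-product check described in the footnote is easy to carry out for a whole sign matrix. The following is a minimal sketch with hypothetical function names. It assumes the identity column (I) is included, which is what makes the rows as well as the columns mutually orthogonal.

```python
from itertools import combinations


def sign_matrix(k):
    """2^k x 2^k matrix: rows are runs, columns are (I) plus the 2^k - 1 effects."""
    n = 2 ** k
    matrix = []
    for run in range(n):          # bit i of `run` set: factor i at its high level
        row = []
        for effect in range(n):   # bit i of `effect` set: factor i is in the effect
            factors_at_low_level = bin(effect & ~run).count("1")
            row.append(-1 if factors_at_low_level % 2 else 1)
        matrix.append(row)
    return matrix


def all_pairs_orthogonal(vectors):
    """True if every pair of vectors has a cross-product sum of zero."""
    return all(
        sum(p * q for p, q in zip(x, y)) == 0
        for x, y in combinations(vectors, 2)
    )


rows = sign_matrix(4)
columns = [list(column) for column in zip(*rows)]
print(all_pairs_orthogonal(columns), all_pairs_orthogonal(rows))
```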

Resolution  IV  Designs  Robust  to  Trend 

Two  steps  are  required  to  construct  the  designs  provided 
in  this  report.  The  first  is  to  construct  a quasi-saturated 
fractional  factorial  that  will  be  suitable  for  screening 
purposes.  The  second  is  to  adapt  it  so  as  to  take  advantage 
of  its  trend-resistant  characteristics. 

We  begin  to  construct  the  design  by  first  determining 
the  design  size  which  depends  on  the  number  of  factors  being 
investigated.  The  rule  is: 

The number of experimental conditions required
is the nearest power of two (2^k) that is equal to
or greater than twice the number of factors to
be studied.

For example, we wish to study 20 factors. Two times
twenty equals 40. The nearest 2^k equal to or greater than 40
is 2^6 = 64 conditions. Or, perhaps we wish to study 8
factors. Eight times two equals 16. The nearest 2^k value
is 2^4 = 16 conditions.
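
The size rule can be written as a short function. This sketch uses a hypothetical name, `design_size`, and ignores the columns that may have to be set aside for trend adjustments.

```python
def design_size(n_factors):
    """Smallest power of two equal to or greater than twice the number of factors."""
    size = 1
    while size < 2 * n_factors:
        size *= 2
    return size


for n_factors in (8, 20, 25):
    print(n_factors, design_size(n_factors))
```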

For this example we shall construct a screening design to
study eight factors. First it is necessary to lay out the
sign matrix for a complete 2^k factorial design. For this
example we use a sign matrix for a 2^4 factorial design. There
would be 16 (N) experimental conditions, arranged in Yates'
(1937) "standard order," capable of estimating the following
(N-1 = 15) effects, also arranged here in the standard order:

A, B, AB, C, AC, BC, ABC, D, AD, BD, ABD, CD, ACD, BCD, ABCD

plus the mean (I). These are referred to in this paper as
the "old" or the "original factorial" labels.
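
The standard order of the effect labels can be generated by introducing one factor at a time and multiplying it into every label already listed. The sketch below uses hypothetical function names and produces both the effect labels and the matching lower-case names of the experimental conditions.

```python
def yates_effect_labels(letters="ABCD"):
    """Effect labels of a 2^k factorial in standard order, excluding the mean (I)."""
    labels = []
    for letter in letters:
        # The new factor, followed by its product with every earlier label.
        labels += [letter] + [earlier + letter for earlier in labels]
    return labels


def yates_condition_names(letters="ABCD"):
    """Experimental conditions in standard order: (1), a, b, ab, c, ..."""
    return ["(1)"] + [label.lower() for label in yates_effect_labels(letters)]


print(", ".join(yates_effect_labels()))
print(", ".join(yates_condition_names()))
```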

Rearranging  the  columns.  We  rearrange  the  column  of 
signs  by  moving  all  columns  with  labels  that  include  Factor  A* 
to  the  left  and  all  remaining  columns  to  the  right.  The 
effects  with  Factor  A are  then  ordered  from  the  largest  to 
the  smallest  interactions  followed  by  the  main  effect,  A. 

Also,  within  any  order  of  interaction,  they  would  be  arranged 
alphabetically.  For  example,  this  would  be: 

          Alphabetical     Alphabetical
ABCD | ABC, ABD, ACD | AB, AC, AD | A    (Old labels)
 4      3    3    3     2   2   2   1    (Size of effect)

The reason for this particular arrangement will be more
evident later. These steps can be followed from here on by
examining the completed design in Table 1.


*Selecting Factor A for this purpose is arbitrary.
Later, in order to find columns that are robust to trends and
also require few factor level changes, it may be necessary to
use a different factor.

Table 1 notes: Three-factor interaction strings aliased with main effects are listed in Appendix I-A.
Two-factor interaction strings aliased with the two-factor interaction labels are listed in Appendix I-B.
Inner-product sums are listed in Appendix I-C.
Blank spaces represent zero percent. Spaces with zeroes represent some percent smaller than 1%.


Next,  we  assign  "new  screening"  labels,  i.e.,  the 
letters from A to H (for the eight factors in our design),
to the rearranged columns which still bear the old factorial
labels, thus:

New labels (Screening Design):       A     B    C    D    E   F   G   H
Original labels (Factorial Design):  ABCD  ABC  ABD  ACD  AB  AC  AD  A

These  are  not  aliases  in  the  usual  sense;  instead  they  are 
merely  associations  that  occur  from  the  relabeling.  To  mini- 
mize confusion,  all  original  factorial  labels,  hereafter, 
will  be  underlined. 

We must next arrange the columns in which Factor A is
not present in the original factorial labels. This is done
by first arranging the columns from left to right according
to the order of the old labels (from the highest to the
lowest interaction and then the main effects), and then, within
each order, arranging the effects alphabetically. In our
example, the columns would be arranged like this:

       Alphabetical
BCD | BC, BD, CD | B, C, D    (Old label)
 3     2   2   2   1  1  1    (Number of factors involved)

There is one less term than there was in the previous set
with the Factor A. The missing column is the Identity column,
(I).


Next we must associate new screening labels with these
old ones. All new ones will be two-factor interactions of the
A to H new labels given to the other set. It happens that
when columns are arranged so that their original factorial
labels are as shown above, new label two-factor interactions
including A will be arranged in reverse alphabetical order, thus:

AH, AG, AF, AE, AD, AC, AB

This makes the column of the original label BCD the
column of the new label AH; the column with the original
label BC is now the column with the new label AG, and so
forth. The complete association across all 16 columns then
would be:

New:  (I)  A     B    C    D    E   F   G   H  AH   AG  AF  AE  AD  AC  AB
Old:  (I)  ABCD  ABC  ABD  ACD  AB  AC  AD  A  BCD  BC  BD  CD  B   C   D


To show that the column associated with both AH and BCD
(new and old labels) is the appropriate one for the interaction
between the columns associated with A or ABCD and H or A, we
multiply new and old at the same time. The associations remain
consistent, thus:

                 New    Old
                 A      ABCD
Multiplied by    H      A
Yields           AH     BCD

This would be true with any of the other combinations. With
the new labels, the 2^4 factorial design has been turned into
a 2^(8-4) screening design, since all main effects, being in
different columns, are orthogonal to one another and to all
two-factor interactions.
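
The relabeling steps can be collected into one short routine. The sketch below is an illustration under the conventions described in the text (Factor A as the pivot, larger interactions first, alphabetical order within each size); the function names are hypothetical.

```python
def yates_effect_labels(letters="ABCD"):
    labels = []
    for letter in letters:
        labels += [letter] + [earlier + letter for earlier in labels]
    return labels


def multiply(x, y):
    """Product of two old labels: shared letters cancel."""
    return "".join(sorted(set(x) ^ set(y)))


def size_then_alphabetical(label):
    return (-len(label), label)


def screening_association(letters="ABCD", pivot="A"):
    """Map each new screening label to the old factorial label of its column."""
    effects = yates_effect_labels(letters)
    with_pivot = sorted((e for e in effects if pivot in e), key=size_then_alphabetical)
    without_pivot = sorted((e for e in effects if pivot not in e), key=size_then_alphabetical)

    # New main-effect labels A, B, C, ... go on the columns that contain the pivot.
    new_mains = [chr(ord("A") + i) for i in range(len(with_pivot))]
    association = dict(zip(new_mains, with_pivot))

    # New interactions with A, in reverse alphabetical order, go on the rest.
    new_pairs = ["A" + other for other in reversed(new_mains[1:])]
    association.update(zip(new_pairs, without_pivot))
    return association


association = screening_association()
consistent = all(
    multiply(association["A"], association[label[1]]) == association[label]
    for label in association
    if len(label) == 2
)
print(association)
print(consistent)
```

The final check repeats, for every interaction with A, the multiplication shown above for AH.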


The next step is to find the aliases within the strings
of two-factor interactions. The simplest procedure is to
continue the pairing of factors, this time beginning with B,
i.e., BH, BG, BF, BE, BD, BC, and not repeating any previously
used pair, e.g., BA = AB. This makes the number of pairs get
smaller each time around, i.e., AH to AB, BH to BC, CH to CD,
DH to DE, EH to EF, FH to FG, and GH. There will be k(k-1)/2
combinations for k factors. For the fully quasi-saturated
design, each string of two-factor interactions will contain k/2
interactions. For example, AH would be aliased with (in this
example) DE (since ACD x AB = BCD); CF (since ABD x AC = BCD);
and BG (since ABC x AD = BCD). Aliases are provided for the
designs given in this report. A computer program for
identifying aliases, prepared by Mr. Howard Lee, is given in
Appendix IV.
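
The pairing procedure can also be sketched in a few lines. This is an illustration only and is not the Appendix IV program. It assumes the new-to-old label association given earlier, and `multiply` and `strings` are hypothetical names. Two old-label words are multiplied by cancelling any letter that appears in both.

```python
from itertools import combinations

# New screening label -> old factorial label, as listed in the text.
NEW_TO_OLD = {"A": "ABCD", "B": "ABC", "C": "ABD", "D": "ACD",
              "E": "AB", "F": "AC", "G": "AD", "H": "A"}

def multiply(word1, word2):
    """Product of two old-label words: letters common to both cancel."""
    return "".join(sorted(set(word1) ^ set(word2)))

# Group every new-label pair by the old-label column it falls in.
strings = {}
for x, y in combinations(sorted(NEW_TO_OLD), 2):
    old_column = multiply(NEW_TO_OLD[x], NEW_TO_OLD[y])
    strings.setdefault(old_column, []).append(x + y)

for old_column, aliases in strings.items():
    print(f"{old_column:4s} {aliases}")
```

With k = 8 this groups the k(k-1)/2 = 28 pairs into seven strings of k/2 = 4 aliases each, and the string in old column BCD contains AH, BG, CF, and DE, as stated above.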

Identifying the experimental conditions. The columns,
along with their old and new labels, have been rearranged.
For the old labels, the names of the experimental conditions
remain the same. For the new labels, new names of the
experimental conditions must be obtained. This can be done with
the newly arranged sign matrix. Each row is a different (and
independent) experimental condition. The "name" of each
experimental condition can be obtained by writing down a
letter corresponding to each new label main effect in the
rearranged design that has a plus sign under it in the
particular row. It is conventional to write the names of
experimental conditions in small letters leaving capital
letters for the names or labels of the effects of the columns.
For example, if the first row of the sign matrix looked like
this after the rearrangement:

New labels:  (I)  A  B  C  D  E  F  G  H  AH  AG  etc.

Signs:        +   +  -  -  -  +  +  +  -  -   +   etc.

then the experimental condition associated with that row
would be:


aefg

since  the  letters  correspond  to  those  of  the  main  effects 
with  + signs  in  their  columns. 
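
A minimal sketch of this naming rule follows. It is not from the report and assumes standard order for the runs and the label association given earlier. The function name `condition_name` is illustrative, and writing "(1)" for a row with no plus signs is the usual factorial convention rather than something stated in the text.

```python
OLD_FACTORS = "ABCD"

# New screening label -> old factorial label, as listed in the text.
NEW_TO_OLD = {"A": "ABCD", "B": "ABC", "C": "ABD", "D": "ACD",
              "E": "AB", "F": "AC", "G": "AD", "H": "A"}

def condition_name(row):
    """Name of the run in 0-based `row` of the standard-order sign matrix."""
    name = ""
    for new, old in sorted(NEW_TO_OLD.items()):
        sign = 1
        for letter in old:
            high = (row >> OLD_FACTORS.index(letter)) & 1  # old A varies fastest
            sign *= 1 if high else -1
        if sign > 0:              # plus sign under this main effect
            name += new.lower()
    return name or "(1)"          # assumed convention when every factor is low

names = [condition_name(row) for row in range(16)]
print(names)
```

The first row yields `aefg`, in agreement with the example above, and the sixteen names are all different.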

Identifying the trend-robust columns. The reasons for
the particular column arrangement described above will now
become more evident. The general idea on which this is
based came from a paper by Daniel and Wilcoxon (1966, p 261;
also see Simon, 1973, pp 121-128) who noted that:

. . . certain of the ordered contrasts appearing
in the $2^p$ system are orthogonal to linear and to
quadratic trends. Some other contrasts are
nearly orthogonal and some are rather heavily
correlated with first and second order trend.

The  design  problem  is,  then,  to  choose  those  sets 
of  ordered  contrasts  that  provide  efficient 
estimation  of  all  desired  effects  and  interactions. 

What they are saying is that certain columns (i.e., the vertical
sequences of plus or minus coefficients in a sign matrix
of a two-level factorial or fractional factorial experimental
design), arranged with the experimental conditions in standard
order, correlate zero or very little with a set of
coefficients representing a linear or a quadratic trend. The same
is true for cubic trends, which Daniel and Wilcoxon did not
consider in their paper. The investigator would want to
assign the more important factors to the columns most robust
to trend so that estimated effects would not be distorted.

Other methods (see Simon, 1973) for handling sequence
effects have been proposed. Some involve making multiple
measures of each condition and arranging them in sequences
that eventually are balanced against trends. Some methods
require a large number of repeated measures in which the
effects have been introduced randomly and the trend effects
isolated by means of statistical techniques. Both approaches
involve far more data collection than is usually justified
during the early screening process. The method proposed by
Daniel and Wilcoxon (1966) provides the most economical
solution by taking advantage of the natural robustness to trend
of $2^k$ or $2^{k-p}$ designs, unreplicated.

To determine the degree to which each column of our
screening design is robust to linear, quadratic, and cubic
trend effects, we must correlate the plus and minus (one)
coefficients in each column of the sign matrix with the
appropriate integer Tchebycheff orthogonal polynomial coefficients
(Fisher and Yates, 1963; Beyer, 1966; DeLury, 1950).

Let us illustrate this with the column for Factor G in
the $2^{8-4}$ screening design (Table 1), originally labeled
Interaction AD in the factorial plan. The ordered column
vector of coefficients (without the ones) for Factor G, and
the ordered Tchebycheff coefficients for linear, quadratic,
and cubic trends are shown in Table 2. The correlation (r)
between linear (L) trends and Factor G is obtained thus:

$$ r_{LG} = \sqrt{\frac{(\Sigma LG)^2}{(\Sigma LL)(\Sigma GG)}} $$

where $\Sigma LG$ is the sum of the cross products (or inner-product
sum) between each pair of effect and linear trend coefficients

$\Sigma LL$ is the sum of the linear trend coefficients, each
squared

$\Sigma GG$ is the sum of the squared coefficients for Factor G
(which will equal N in these designs)

Thus to calculate the values needed to solve the equation,
from the data in Table 2, we do the following:

$$ \Sigma LG = (-15)(+1) + (-13)(-1) + (-11)(+1) + \dots + (+13)(-1) + (+15)(+1) = 0 $$

$$ \Sigma GG = (+1)^2 + (-1)^2 + (+1)^2 + \dots + (-1)^2 + (+1)^2 = 16 $$

$$ \Sigma LL = (-15)^2 + (-13)^2 + (-11)^2 + \dots + (+13)^2 + (+15)^2 = 1{,}360 $$


TABLE 2

ILLUSTRATING USE OF TCHEBYCHEFF'S COEFFICIENTS TO
CALCULATE INNER-PRODUCT SUMS AND SUM OF SQUARES

  Factor G          TCHEBYCHEFF'S COEFFICIENTS
Coefficient*     Linear     Quadratic        Cubic

     +             -15         +35            -455
     -             -13         +21             -91
     +             -11          +9            +143
     -              -9          -1            +267
     +              -7          -9            +301
     -              -5         -15            +265
     +              -3         -19            +179
     -              -1         -21             +63
     -              +1         -21             -63
     +              +3         -19            -179
     -              +5         -15            -265
     +              +7          -9            -301
     -              +9          -1            -267
     +             +11          +9            -143
     -             +13         +21             +91
     +             +15         +35            +455

Sum of
squares: 16      1,360       5,712       1,007,760

*Plus or minus signs represent coefficients of +1 and -1
respectively.


Substituting these values in the equation, we get:

$$ r_{LG} = \sqrt{\frac{(0)^2}{16 \times 1360}} = \sqrt{\frac{0}{21760}} = 0 $$


With a zero correlation, an estimated effect of Factor G
would be totally unaffected if an unwanted linear trend
effect was running through the data.

Repeating the process for the quadratic trend and
Factor G we get:

$$ \Sigma QG = (+35)(+1) + (+21)(-1) + (+9)(+1) + \dots + (+21)(-1) + (+35)(+1) = 64 $$

$$ \Sigma QQ = (+35)^2 + (+21)^2 + (+9)^2 + \dots + (+21)^2 + (+35)^2 = 5712 $$

$$ \Sigma GG = N = 16 $$

Substituting in the equation, we get:

$$ r_{QG} = \sqrt{\frac{(64)^2}{16 \times 5712}} = \sqrt{\frac{4096}{91392}} = \sqrt{.044818} = .2117 $$

The percentage of overlap between the quadratic trend and
the effect of Factor G is, therefore:

$$ \%_{QG} = (r_{QG})^2 \times 100 = (.2117)^2 \times 100 = .0448 \times 100 = 4.5 $$

The  correlation  between  Factor  G and  the  cubic  trend 
effect  was  zero. 
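
The worked example for Factor G can be reproduced with a short script. This is a sketch under stated assumptions and not the report's procedure: the trend coefficients are generated from the standard closed forms for equally spaced points rather than read from a table, so the cubic values are a constant multiple of the integer Tchebycheff coefficients (rescaling does not change r). The names `trend_coefficients` and `correlation` are illustrative.

```python
from fractions import Fraction
from math import sqrt

N = 16  # runs in the 2^(8-4) design

def trend_coefficients(n):
    """Unscaled linear, quadratic, and cubic orthogonal-polynomial values."""
    xs = [Fraction(2 * i - (n - 1), 2) for i in range(n)]  # centred trial positions
    linear = [2 * x for x in xs]                            # -15, -13, ..., +15 for n = 16
    quadratic = [x * x - Fraction(n * n - 1, 12) for x in xs]
    cubic = [x ** 3 - x * Fraction(3 * n * n - 7, 20) for x in xs]
    return linear, quadratic, cubic

# Factor G is the old AD column: old A changes every run, old D after run 8.
G = [(1 if row & 1 else -1) * (1 if row & 8 else -1) for row in range(N)]

def correlation(trend, effect):
    """r as defined in the text: sqrt of (cross product)^2 / (SS trend x SS effect)."""
    cross = sum(t * e for t, e in zip(trend, effect))
    ss_trend = sum(t * t for t in trend)
    ss_effect = sum(e * e for e in effect)
    return sqrt(cross * cross / (ss_trend * ss_effect))

results = {}
for name, trend in zip(("linear", "quadratic", "cubic"), trend_coefficients(N)):
    r = correlation(trend, G)
    results[name] = (round(r, 4), round(100 * r * r, 1))  # (r, percentage overlap)
    print(name, results[name])
```

Exact fractions are used so that the zero correlations come out as exactly zero rather than as small rounding residues.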

To discover which columns are the most robust to trends,
this process is repeated for all relationships between
linear, quadratic, and cubic trend effects and the
experimental effects (main and two-factor interaction strings).
However, these calculations are supplied for the designs
given in this report.

Daniel and Wilcoxon (1966, pp 269-270) point out how
Yates' (1937) algorithm, when applied directly to the
Tchebycheff coefficients, can be used to calculate the inner-product
sums more easily than if these were obtained a column at a time.
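
As an illustration of that shortcut, here is a minimal sketch of Yates' algorithm applied to the integer Tchebycheff coefficients for 16 trials. It assumes standard order for the runs. The function names are illustrative, and the output labels are the original factorial labels, so Factor G appears under AD.

```python
def yates(values):
    """Yates' algorithm: inner products of `values` with every effect column,
    returned in standard order (I, A, B, AB, C, AC, BC, ABC, D, ...)."""
    out = list(values)
    n = len(out)
    passes = n.bit_length() - 1  # k passes for 2^k values
    for _ in range(passes):
        pair_sums = [out[i] + out[i + 1] for i in range(0, n, 2)]
        pair_diffs = [out[i + 1] - out[i] for i in range(0, n, 2)]
        out = pair_sums + pair_diffs
    return out

def effect_label(index, factors="ABCD"):
    """Old factorial label for position `index` of the Yates output."""
    return "".join(f for j, f in enumerate(factors) if (index >> j) & 1) or "(I)"

LINEAR = list(range(-15, 16, 2))
QUADRATIC = [35, 21, 9, -1, -9, -15, -19, -21, -21, -19, -15, -9, -1, 9, 21, 35]
CUBIC = [-455, -91, 143, 267, 301, 265, 179, 63,
         -63, -179, -265, -301, -267, -143, 91, 455]

inner_products = {}
for name, coeffs in (("L", LINEAR), ("Q", QUADRATIC), ("K", CUBIC)):
    inner_products[name] = {effect_label(i): v for i, v in enumerate(yates(coeffs))}

for name in ("L", "Q", "K"):
    print(name, "inner product with old AD (Factor G):", inner_products[name]["AD"])
```

One pass over each trend vector gives the inner-product sums for all fifteen effects at once; for old AD the values are 0, 64, and 0, matching the column-at-a-time calculation for Factor G.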

When all of the effects for any design are correlated with the
trends, the relationships show two distinct patterns. For one,
referring to the original factorial labels, certain types of
sources are always correlated with particular trend effects. Thus:

Four-factor interactions     Uncorrelated with L, Q, or K trends*
and higher

Three-factor interactions    Correlated with cubic but not with
                             linear or quadratic

Two-factor interactions      Correlated only with quadratic

Main effects                 Correlated with linear and cubic
                             but not quadratic

A second pattern is also apparent. Within any set of effects
of the same order, if they exist at all, the correlations
increase (using the labels of the original factorial) as the
factors progress alphabetically. Thus, the AC interaction
would be more correlated with a quadratic effect than the AB
interaction, and so forth. Both of these patterns can be
seen in the $2^{8-4}$ design (Table 1), but they become even
clearer with larger designs.
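
Both patterns can be examined for the sixteen-run case with the sketch below. It is an illustration under stated assumptions (standard order, closed-form orthogonal polynomials for equally spaced trials) and not the source of the tabled values in this report. The names `column`, `trends`, `percent_overlap`, and `table` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

FACTORS = "ABCD"  # original factorial labels
N = 16

def column(label):
    """Sign column of an old-label effect, runs in standard order."""
    signs = []
    for row in range(N):
        s = 1
        for letter in label:
            s *= 1 if (row >> FACTORS.index(letter)) & 1 else -1
        signs.append(s)
    return signs

def trends(n):
    """Linear (L), quadratic (Q), and cubic (K) trend values, unscaled."""
    xs = [Fraction(2 * i - (n - 1), 2) for i in range(n)]
    return {"L": xs,
            "Q": [x * x - Fraction(n * n - 1, 12) for x in xs],
            "K": [x ** 3 - x * Fraction(3 * n * n - 7, 20) for x in xs]}

def percent_overlap(trend, effect):
    """100 x r squared between a trend and an effect column."""
    cross = sum(t * e for t, e in zip(trend, effect))
    return float(100 * cross * cross / (sum(t * t for t in trend) * len(effect)))

table = {}
for order in range(1, len(FACTORS) + 1):
    for letters in combinations(FACTORS, order):
        label = "".join(letters)
        col = column(label)
        table[label] = {name: round(percent_overlap(t, col), 1)
                        for name, t in trends(N).items()}
        print(f"{label:5s}", table[label])
```

In the printed table the quadratic overlap rises from AB to AC to AD, while the cubic overlap of the main effects does not rise steadily, which is a reminder that the second pattern is a tendency and not a rule.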

It should be clearer now why the columns of the screening
design are reordered as they are. It allows main
effects (new labels) to be assigned to the columns less
correlated with trends and the two-factor interaction strings
to be assigned to columns more correlated with trend effects.
For screening purposes, this greater emphasis on keeping
main effects clean is appropriate. The column reordering
also tends to place the least correlated within these two
groups more to the left of the design, facilitating its use.

*The letter K is used to represent the cubic trend to
avoid confusion when the letter C is used to represent an
experimental factor.

Since this general pattern does not hold exactly,
with each of the designs given in this report the percentage
overlap (= $r^2 \times 100$) between each factor and trend
combination is provided. The investigator can use these when he
must decide how to assign real-world factors to the design
columns.

Counting factor-level changes. One can merely count the
number of times any column requires a change of factor levels.
For example, in the $2^{8-4}_{IV}$ design (Table 1), in column AB the
level is changed only once, from low to high between the
eighth and ninth trial, while in column H, the levels are
changed fifteen times, once between every pair of successive
trials. Within each design
the number of times the factor level changes (the count) is
a different value in each column, from one to N-1 for N
experimental conditions (and N-1 effects) in each $2^{k-p}$ design.
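
The direct count is easy to automate. The sketch below is illustrative only and assumes standard order together with the new-to-old label association given earlier; `changes` and `counts` are hypothetical names.

```python
OLD_FACTORS = "ABCD"

# New label -> old label for all 15 columns of the 2^(8-4) design (from the text).
NEW_TO_OLD = {"A": "ABCD", "B": "ABC", "C": "ABD", "D": "ACD",
              "E": "AB", "F": "AC", "G": "AD", "H": "A",
              "AH": "BCD", "AG": "BC", "AF": "BD", "AE": "CD",
              "AD": "B", "AC": "C", "AB": "D"}

def column(old_label, n_rows=16):
    """Sign column of an old-label effect, runs in standard order."""
    signs = []
    for row in range(n_rows):
        s = 1
        for letter in old_label:
            s *= 1 if (row >> OLD_FACTORS.index(letter)) & 1 else -1
        signs.append(s)
    return signs

def changes(col):
    """Number of times the level changes between successive trials."""
    return sum(1 for a, b in zip(col, col[1:]) if a != b)

counts = {new: changes(column(old)) for new, old in NEW_TO_OLD.items()}
for new, count in sorted(counts.items(), key=lambda item: item[1]):
    print(f"{new:3s} {count}")
```

Column AB gives a count of 1 and column H a count of 15, and the fifteen columns take each value from 1 to 15 exactly once.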

As  the  designs  get  bigger,  it  may  be  inconvenient,  as 
well  as  time-consuming  to  count  the  changes  in  each  column. 
The  following  algorithm  can  be  used  instead: 

1.  Using the original factorial labels, with
    experimental conditions in Standard Order,
    determine the counts for the main effects.

    If the letters for the main effects are
    written in reverse alphabetic order, the
    count for each will be:

        ($2^k - 1$)

    where k is the position of the main effect
    in the reverse order sequence:

        e.g., in a three-factor design, the
        main effects are A, B, and C. In
        reverse order they are C, B, and A,
        in positions 1, 2, and 3 respectively.

        Their factor level change count would
        therefore be:

            C:  $2^1 - 1 = 1$
            B:  $2^2 - 1 = 3$
            A:  $2^3 - 1 = 7$

2.  To determine the count for any interaction, the
    counts for the individual main effects are
    combined always as: plus, minus, plus, minus,
    etc., starting with plus and going as far as
    necessary;

        e.g., the count for the interaction ABC
        (the letters must be ordered alphabetically)
        would be:

             A    B    C
            +7   -3   +1  =  5

        or for BC:

             B    C
            +3   -1  =  2

        or for AB:

             A    B
            +7   -3  =  4

        Of course, the count can be simplified
        since ABC would also equal:

            AB  +  C
            +4    +1  =  5
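
The two steps above can be sketched as a short function and checked against a direct count. This is a minimal illustration, assuming runs in standard order with the first factor varying fastest; `rule_count` and `direct_count` are hypothetical names.

```python
from itertools import combinations

def rule_count(effect, factors):
    """Level-change count from the alternating plus/minus rule."""
    total = 0
    for i, letter in enumerate(sorted(effect)):       # letters in alphabetical order
        k = len(factors) - factors.index(letter)      # position in reverse alphabetical order
        total += (-1) ** i * (2 ** k - 1)             # plus, minus, plus, ...
    return total

def direct_count(effect, factors):
    """Count sign changes down the column with runs in standard order."""
    col = []
    for row in range(2 ** len(factors)):
        s = 1
        for letter in effect:
            s *= 1 if (row >> factors.index(letter)) & 1 else -1
        col.append(s)
    return sum(1 for a, b in zip(col, col[1:]) if a != b)

agreement = True
for factors in ("ABC", "ABCD"):
    for order in range(1, len(factors) + 1):
        for letters in combinations(factors, order):
            effect = "".join(letters)
            if rule_count(effect, factors) != direct_count(effect, factors):
                agreement = False

print("ABC:", rule_count("ABC", "ABC"))
print("BC: ", rule_count("BC", "ABC"))
print("AB: ", rule_count("AB", "ABC"))
print("rule agrees with direct count:", agreement)
```

For the three-factor example the rule returns 5, 2, and 4 for ABC, BC, and AB, and it agrees with the direct count for every effect in the three- and four-factor designs.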

PREPARING TO USE SCREENING DESIGNS

It takes more to properly design an experiment than to
describe the experimental design. Screening designs tell
us at what coordinates in the abstract experimental space
we should sample performance to obtain information regarding
main effects without bias from two-factor interaction
effects. However, the investigator has more to do if he
wishes to use these designs effectively.

Pre-analysis to Select the Experimental Factors


Before he selects the final set of factors to be
included in the screening study, the investigator should
prepare an unrestricted list of factors which reasonable
and knowledgeable experts believe may have a non-trivial
influence on the real-world task most of the time. This
first step is designed to make certain that any source
likely to influence the performance of the task under
investigation be listed for consideration, whether it be
related to the equipment, subject, environment, or task.
The value of this exercise is that it reduces the chance that
factors are omitted too early in the effort because of
practical considerations, real or imagined, at that time.
This, of course, is no license to
list every factor imaginable, but any that are likely to
influence the performance at hand should be included in this
initial step.

The second step is to define the task, with emphasis on
the conditions in the real world. This includes an
operational definition of the performance measure (and more likely
measures) that will be employed, as well as the nature of the
stimuli and responses of the specific situation. While this
short statement does not do justice to the care required
and the importance of this requirement, the matter will not
be discussed any further in this report.


The third step is to decide what real-world values to
set at the upper (+1) and lower (-1) limits of each factor.
These values should be selected, based on the following
considerations:

1.  Limits likely to be experienced in the real
    world for the task under consideration.

2.  Limits set by the state-of-the-art in
    the real world.

3.  Limits set by the state-of-the-art in
    simulation, which may be beyond those in
    the real world so that information
    regarding future systems can be collected.

4.  Limits set by construction costs, where
    the information lost is not considered
    critical.

5.  Limits set by manipulation difficulties,
    where the information lost is not
    considered critical.

6.  Limits that are likely to approximate the
    points at which the highs and lows of
    performance will occur. (This is particularly
    important when the function between the
    factor and performance is probably U-shaped.)

The limits selected obviously affect, to some extent, how
critically a particular factor will appear to affect
performance. If the limits are too narrow, performance may
change little and an investigator may read this (incorrectly)
as meaning the factor has a trivial effect on
performance, when in fact it is true only within the limits
being studied. Had the limits been set wider, the effect
would be greater. This is why the limit values
should be determined by real-world interests, so that effects
are measured under conditions of practical interest in
the operational situation. Do not do as one eminent
psychologist did when he failed to get an effect from some factor:
do not expand the range in the simulation beyond anything
likely to be found in reality so that the factor would show
a significant effect.

The fourth step is to assign priorities to the original
list of variables based on a number of considerations:

1.  Order the factors on a five-point scale (if
    possible) according to how much each,
    within its specified limits, is likely
    to affect the performance on the particular
    task.

2.  Indicate those factors in which the
    interested parties (e.g., contracting agency and
    the investigator) have a special interest.

3.  Indicate those factors that are expensive
    to simulate.

4.  Indicate those factors that are likely to
    interact with one another, noting
    particularly the ones likely to result in disordinal
    interactions.

The investigator must weigh these subjectively to select the
final set of the factors for the experiment. The listing
exercise provides him with a better overview when making his
decision. Ultimately, he must consider how his decision
affects the experiment's capacity to reflect reality for the
task under consideration.

Pre-analysis  to  Facilitate  the  Use  of  Screening  Designs 


Once the factors have been selected, the next step is to
anticipate how they will fit into a screening study. This
can be a mixed process of analysis and empirical data
collection. However, pre-analysis is always desirable, whether or
not it is to be followed by preliminary or formal data
collection, for it can show a priori that certain effects,
observed later, were anticipated. An anticipated disordinal
interaction can be accepted as real with greater confidence
when found in the data than one that was not anticipated.


The investigator should make the following analyses as
an aid to using the screening design:

1.  Classify the factors according to their quantitative
    characteristics: ordered-continuous; ordered-
    discrete; ordered-complex-categorical (by choice);
    categorical.

    This provides a preview of design characteristics
    needed to handle each factor. Ordered factors can
    eventually become part of a response surface, and
    may be assumed continuous for certain applications,
    but all levels of the factor may not be available
    as a design data collection point. On the other
    hand, complex factors which are treated as
    categorical ones but which are in fact a particular
    combination of ordered and continuous factors, may be
    redefined according to these parameters. Most
    economical multifactor designs can be used more
    effectively with ordered and continuous factors;
    fewer data points generally have to be taken and
    the chances that greater-than-second-order
    interactions are non-trivial are small.

For the ordered factors,

2.  Estimate the response function between the given
    limits of each factor. Four functions are of
    major interest: linear; quadratic or U-shaped;
    negatively accelerated growth pattern; and cubic
    or S-shaped.

    This will aid in deciding how complex a model may
    be needed to approximate the response surface,
    how many levels will be needed to approximate the
    individual functions, and where the limiting data
    points must be located.

3.  Decide what measurement scale might be used to
    simplify any non-linear function that was
    anticipated. This helps meet the requirements for a
    lower-order response surface when economical
    multifactor designs are used.

4.  Attempt to draw the interaction effects that are
    considered important, and consider the scaling
    that would eliminate the ordinal interactions.

Pre-experiment  Data  Collection 

Certain information can only be obtained empirically.
Some data might be collected, if deemed important by the
investigator, to make a quick but tentative check on
assumptions made in the foregoing analysis. Other information,
however, is vital if the screening designs are to be used,
and should be collected, in fact, prior to any experiment.
The most important ones are:

1.  Test the trial-to-trial reliability for a single
    condition. (Reliability Test)

Test a typical subject on five or more consecutive trials
of a single condition. Does performance from trial to trial
vary irregularly and excessively (see Figure 1-A)? If so,
this suggests that some critical source of variance has not
been identified and/or is not under control, and should be.
Are there signs of a progressive trend effect over the five
trials (see Figure 1-B)? This suggests that the subject
might not be sufficiently familiarized with the task or the
experimental apparatus. Either more practice, trend
isolation techniques, or both may have to be employed. Is there
an immediate improvement in performance and then a leveling
off (see Figure 1-C)? This suggests that some precautions
need be taken to offset momentary perturbations as each new
trial condition is introduced.


Figure  1.  Examples  of  Trial-to-Trial  Performance  Variability 

Differences in the extent of trial-to-trial variance
under easy and difficult conditions with experienced and
inexperienced subjects provide clues to the need for proper
response scaling and other variance-control mechanisms.

2.  Test for subject-to-subject variability within
    presumably homogeneous groups. (Subject
    Heterogeneity Test)

On both easy and difficult experimental conditions, a
number of presumed equivalent subjects should be tested. If
their performance differs considerably, then one may suspect
that critical subject characteristics are being ignored.
Quite often, subjects are considered homogeneous according to
some simple label, but are not so insofar as their performance
is concerned. This test provides some clues as to whether
those subject factors should be measured or controlled in the
experiment.

When  faced  with  the  need  to  introduce  a new  subject 
characteristic  as  a dimension  of  a screening  design,  the 
investigator  must  consider  the  nature  of  the  characteristic. 

If  the  characteristic  is  simple  and  readily  quantifiable 
(e.g.,  visual  acuity),  then  it  probably  should  be  introduced 
into  the  experimental  design  as  any  other  factor.  This 
means  that  subjects  within  different  levels  of  visual  acuity  — 
two  levels  for  a screening  study  — would  be  used,  each 
performing  a particular  combination  of  the  levels  of  the 
remaining  factors  representing  the  experimental  condition. 

If  the  characteristic  is  complex  and  difficult  to  quantify 
(e.g.,  pilot  experience),  initially  it  might  be  better  to 
run  subjects  representing  each  of  the  two  levels  on  every 
experimental  condition.  This  permits  a subject-by-factor 
interaction,  if  it  exists,  to  be  detected. 

3.  Test to determine whether there are conditions
    which can be performed perfectly or can't be
    performed at all by most subjects.

When too many experimental conditions are too difficult
or too easy, the information provided by a screening design
is severely limited. An investigator may have to "live with
it," or he may find that by making slight adjustments in the
range of a few factors, he can eliminate these uninformative
upper and lower limits. This, however, should never take
priority over practical interests and the reality of the
situation.


4.  Test a very good and a very poor subject on the
    easiest and most difficult tasks. (Interaction
    Test)

How performance is distributed among these four
conditions provides valuable clues regarding the task, its range
of difficulties, and the scaling of the dependent variable.
Four types of solutions are shown in Figure 2. In Figure
2-A no interaction is present, while in Figure 2-B an
important type of disordinal interaction is shown, warning of
the presence of interactions that the unaugmented screening
design is poorly equipped to handle. Figures 2-C and 2-D
suggest the presence of ceiling and floor effects, respectively,
which may be reduced through appropriate scaling.
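
A rough way to read the four scores from the Interaction Test is sketched below. It is an illustration only and not a procedure given in this report. It assumes higher scores mean better performance and takes "disordinal" to mean that the difference between levels of one factor reverses sign across the levels of the other. The function name, the `tolerance` parameter, and all scores in the usage lines are hypothetical, and no attempt is made to detect ceiling or floor effects, which would require knowing the limits of the measurement scale.

```python
def classify_interaction(good_easy, good_hard, poor_easy, poor_hard, tolerance=0.0):
    """Classify the 2 x 2 pattern of scores from a good and a poor subject
    on the easiest and the most difficult task."""
    subject_effect_easy = good_easy - poor_easy
    subject_effect_hard = good_hard - poor_hard
    task_effect_good = good_easy - good_hard
    task_effect_poor = poor_easy - poor_hard

    # Interaction contrast: does the subject difference depend on the task?
    contrast = subject_effect_easy - subject_effect_hard
    if abs(contrast) <= tolerance:
        return "no interaction"

    # Disordinal (assumed definition): a simple effect reverses its sign.
    reverses = (subject_effect_easy * subject_effect_hard < 0
                or task_effect_good * task_effect_poor < 0)
    return "disordinal interaction" if reverses else "ordinal interaction"

# Hypothetical scores, made up purely to exercise the function.
print(classify_interaction(90, 60, 70, 40))  # parallel lines
print(classify_interaction(90, 40, 50, 70))  # lines cross
print(classify_interaction(95, 90, 90, 50))  # lines diverge without crossing
```

An ordinal pattern such as the third case is the kind that rescaling the response may reduce, whereas a disordinal pattern cannot be removed by a monotonic change of scale.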


Selecting  the  Screening  Design 

Two major considerations in selecting a screening design
are:

    1)  whether or not one wishes to isolate main
        and two-factor interaction effects immediately
        before examining part of the data;

Figure 2. Examples of Types of Interactions Including No Interaction

Resolution III or IV designs. It would be unusual for
an investigator performing human factors engineering
experiments (or any behavioral science study) not to want to isolate
main from two-factor interaction effects. Two-factor interactions
occur too frequently to risk their distorting main effects, even
in a screening study. On that basis, an investigator may
wish to use a Resolution IV design from the beginning
without resorting to blocking. The added advantage of the
Resolution IV design is robustness to trends (more so than a
Resolution III design).

As more factors are to be investigated and the cost of
data collection becomes uncomfortably high, there may be
stronger reasons to begin with a Resolution III design, as the
first block, and then later add a second Resolution III
design to create a Resolution IV design. First of all,
blocking enables data to be examined and factors added or
dropped, or their ranges changed if necessary, after half as
much data has been collected as would be required were the
full Resolution IV design completed first. Second, blocking
facilitates the control of certain irrelevant sources of
variance (Simon, 1970a; 1974, pp 100-103). Finally, running
experiments in small blocks reduces the chances that some
disruptive force would destroy the entire experiment.
Equipment breakdowns may be less likely to occur and subject
sickness may be easier to avoid. In either case, it is
easier to recoup from the loss of a small block of data
than it would be if an entire study were lost because of
some disturbance occurring part way through the experiment.

In this report, only the Resolution IV plans will be
discussed. Resolution III plus foldover plans were
discussed in an earlier report (Simon, 1973, pp 89-125).


Number of factors. The Resolution IV designs provided
in this report are capable of handling up to 8, 16, or 32
factors (and others capable of handling up to $2^6$, $2^7$ ... $2^n$
factors can be created by the same process). However, the
number of experimental factors that can be studied in any
design will be reduced if the investigator plans to isolate
trend effects or is restricted by particular combinations of
factors, interactions, and trend contamination.

The investigator must allow for making trend estimates,
losing one column (or experimental factor) for each trend
(linear, quadratic, and/or cubic) effect that is to be
isolated. Further restrictions on the number of available
columns (and therefore factors to be studied) may occur if an
investigator wants to keep certain combinations of main and
interaction effects robust to linear or quadratic effects.
If he decides to block his design, he must sacrifice still
more columns, which reduces the number of factors that can
be studied still more. For all of these reasons, an
investigator must select a design large enough to handle more than
just the number of experimental factors.

The  Designs 

The  following  basic  Resolution  IV  screening  designs  and 
supporting  data  are  provided  in  this  report: 

Design              Number of Factors    Minimum Number of       Design to Use
                    to be Studied        Observations (N) for
                                         a Single Replication

$2^{8-4}_{IV}$      Up to eight          16                      Table 1, Appendix I
$2^{16-11}_{IV}$    Nine to 16           32                      Appendix II
$2^{32-26}_{IV}$    Seventeen to 31      64                      Appendix III

The following information is provided with each design:

1.  The sign matrix

2.  The experimental conditions

3.  The original factorial design labels

4.  The new screening design labels

5.  Trend-robust test order

6.  Percentage  overlap  between  linear,  quadratic, 
and  cubic  trend  effects  and  experimental 
design  effects 

7.  Number  of  changes  made  between  levels  for 
each  factor 

8.  Two-factor  interaction  aliases 

9.  Three-factor  aliases  of  main  effects 

10.  Inter-product  sums  used  to  adjust  factors  for 
trend  effects 


Assigning  Factors  to  the  Columns  of  the  Design 

In assigning the real-world factors to the columns of
the design matrix, the investigator will be concerned with
which main effects and which interactions must be kept
trend-robust and which will require the fewest factor-level
changes. These decisions, of course, will depend on:

a)  Which ones are the most important and thus
    should be estimated with the smallest amount
    of trend bias.

b)  Which ones are important but are so
    unquestionably large that they will be identified
    even though the data are somewhat distorted.

c)  Which ones are the most difficult or most
    time-consuming to change from level to level.

d)  Which ones are likely to show large two-
    factor, disordinal interaction effects.

Trends.  In each table, the percentage overlap at the
bottom of the columns shows the investigator how much each
column will be contaminated with trend effects. Columns
affected by linear trends are not affected by quadratic
trends. In making his selection, however, the investigator
should realize that in human factors performance data, linear
effects are generally larger than quadratic, and both are
generally larger than cubic effects. Thus, a 10% overlap
for a linear effect would ordinarily be much more likely to
distort the data than a 10% quadratic overlap. There are,
of course, no absolute rules, and the investigator is obligated
to minimize these effects by his experimental
procedures (Simon, 1974, pp 21-26) so that when trends do
appear, they will be small to begin with relative to the
effects under investigation, making the absolute amount of
overlap even smaller.
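The percentage overlap figures can be illustrated with a short calculation. The sketch below assumes that "percentage overlap" means 100 times the squared correlation between a design column (taken in the order the conditions are run) and an orthogonal polynomial trend of the stated degree; that definition, the function name trend_overlap_percent, and the two example columns are assumptions made for illustration and are not taken from the design tables.

```python
import numpy as np

def trend_overlap_percent(column, max_degree=3):
    """Return [linear, quadratic, cubic] overlap as 100 * r**2.

    `column` holds the +1/-1 levels of one design column in run order.
    Treating "overlap" as squared correlation with an orthogonal
    polynomial trend is an assumption of this sketch.
    """
    c = np.asarray(column, dtype=float)
    c = c - c.mean()
    t = np.arange(c.size, dtype=float)
    t = t - t.mean()
    # QR of the Vandermonde matrix [1, t, t^2, ...] gives orthonormal
    # polynomial trend vectors; column d of q is the degree-d trend.
    q, _ = np.linalg.qr(np.vander(t, max_degree + 1, increasing=True))
    r = q[:, 1:].T @ c / np.linalg.norm(c)
    return [round(100.0 * float(v) ** 2, 2) for v in r]

# Two hypothetical 8-run columns: one level change versus a change every run.
single_change = [-1, -1, -1, -1, +1, +1, +1, +1]
alternating = [-1, +1, -1, +1, -1, +1, -1, +1]
print(trend_overlap_percent(single_change))
print(trend_overlap_percent(alternating))
```

In this illustration the column with a single level change overlaps the linear trend heavily, while the column that changes on every run overlaps it very little.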

Special problems of assignment arise when the investigator
wishes to keep both main effects and the two-factor
interactions reasonably trend-free. There are fewer interaction
columns that are trend-free or trend-robust, and the
magnitude of the overlap is, on the average, higher than in the
main effect columns. Anything not overlapping more than
10% with a linear trend is probably reasonable to use.
An overlap of less than 30% and 50% between interactions and
quadratic or cubic effects, respectively, would also probably
be acceptable if the investigator had no reason to believe
that this type of trend would be present to any degree and
had done his best to reduce such trends through his data collection
procedures. These percentages are of course arbitrary,
and depend in part on how cautious an investigator feels he
must be.

As the number of factors increases, i.e., the larger the
designs, the options available to an investigator in this
regard increase. Even if the investigator can't get a
trend-free interaction column with these designs, he still
has two options. First, he can make adjustments for trends
(to be discussed in the section on Analysis). Second, he
may modify the design (to be discussed later in this section).

Count.  Screening designs are valuable because they
permit a large number of factors to be investigated
quickly. But if it takes a great deal of time to change
the factor levels from trial to trial, this prime advantage
will be lost. The sophisticated experimenter, if he has
any say in the matter, will see that every means is taken
when the experimental apparatus is being built to ensure
that a rapid and accurate change can be made between levels
of all factors. Delays may affect the subject's motivation
and performance, and errors in settings can destroy the value
of the data. When normal precautions are taken, however, it
is more common to find that only a few of the total number
of factors present serious difficulties insofar as changing
the factor levels is concerned.

The problem of assigning the factors to the proper
columns of the design depends, therefore, on both the number of
factors that must be considered and the degree of
difficulty in making the factor-level changes. For example,
if it takes a day to make a change in the level of a particular
factor, then the investigator would probably prefer to
assign the main effect of that factor to a column requiring
only a single change. If it takes only several minutes, he
may be content to assign it to a column requiring more
changes.
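The factor-level change count for a column is simply the number of times its level differs between consecutive runs. A minimal sketch follows, using a full 2^3 factorial in standard order purely as an illustration; it is not one of the screening designs given in this report, and the function name count_level_changes is hypothetical.

```python
from itertools import product

def count_level_changes(column):
    """Number of times a factor's level differs between consecutive runs."""
    return sum(1 for prev, cur in zip(column, column[1:]) if prev != cur)

# Full 2^3 factorial in standard order (factor A changes fastest).
runs = [tuple(reversed(r)) for r in product((-1, +1), repeat=3)]
columns = {name: [run[i] for run in runs] for i, name in enumerate("ABC")}
changes = {name: count_level_changes(col) for name, col in columns.items()}
print(changes)  # {'A': 7, 'B': 3, 'C': 1}
```

A factor that takes a day to reset would be a natural candidate for a column like C in this illustration, and one that takes minutes could tolerate a column like A.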

Unfortunately, with the designs provided in this report,
the main effects are all associated with columns that require
at least N/2 changes, where N is the number of experimental
conditions in the design. This means that even
the main effect column with the smallest factor-level change
count still requires a great many changes. Furthermore, this
problem increases as the size of the design increases.

The problem of factor assignment is further complicated
if the investigator wishes the column selected for its
minimum number of changes also to be reasonably robust to
trend effects. But it is apparent from the designs that,
on the average, those columns most robust to trends are the
ones requiring the greatest number of factor-level changes.
The designs, as they have been arranged for this paper,
maximize this inverse relationship. For example, in the
2^(16-11)_IV design (Appendix II) the column identified as the
string containing the AB interaction is the one requiring
only a single factor-level change, but it also is the one
with the cubic trend. The column requiring only two factor-level
changes (i.e., the string of two-factor interactions
with AF in it) has a 71% overlap with the quadratic trend, a
somewhat better situation, but not a comfortable one. About
the first reasonable compromise in the 32-run design would
be the column identified as the two-factor interaction
string including AH, requiring four factor-level changes and
an overlap of only 4% with the quadratic trend.

Thus, it seems that with the designs given in this
report, in order to have only a few factor-level changes, a
main effect must be assigned to one of the columns made up
of two-factor interaction strings. While this is possible,
since it has already been noted that we may assign any
labels to the columns, it is still not a simple matter, for
it triggers a series of reactions involving the other
columns in order to maintain the appropriate relationships
among main and interaction effects. However, there is a
solution that an investigator may use if necessary. The
given designs are intended to optimize the robustness to
trends, but if it is also necessary to be concerned with
factor-level changes at the same time, the designs can be
easily modified to meet this need.

Modifying  the  Given  Designs 

With the given designs, the smallest factor-level change
count for a main effect will be equal to N/2, where N is the
number of observations in the study, and in the 2^(k-p)_IV designs,
no linear or quadratic trend effect overlaps a main effect
by 10% or more. If it is necessary to reduce the factor-level
change count by sacrificing the robustness to trend,
one may repeat the procedures given in this report to create
the original designs in Table 1 and Appendices II and III,
except that instead of assigning to main effects all of the columns
containing a Factor A in the original labels, we would assign
all those containing Factor B, or Factor C, or Factor D, and
so forth, in the original labels, depending on what
mixture of factor-level change count and trend resistance is
required.

For example, with the 2^(16-11)_IV design, if we used all
columns originally labeled with Factor B in them for the
main effects, then the smallest factor-level change count
associated with a main effect would be four, but now only one
out of eight main effects is overlapped by linear or quadratic
trends by more than 10%. If all columns labeled with
Factor C in them had been used for the main effects, then
the smallest count would be two and only two of the eight
factors would overlap linear and quadratic trend effects by
more than 10%. At the same time that trend resistance
among main effects is decreasing, more trend-resistant
columns are being associated with the two-factor interaction
strings. The effects of building a 2^(16-11)_IV screening design
where the main effects are associated with the columns of
the original Factors A, B, C, or D are shown in Table 3*.


*

For  completeness,  the  reader  should  be  aware  of  other 
efforts  to  develop  experimental  plans  that  are  robust  to  trend 
while  minimizing  the  number  of  factor-level  changes  required. 
Simon  (1974,  pp  138-146)  described  the  methods  proposed  by 
Draper  and  Stoneman  (1968)  and  Dickinson  (1974).  Their  plans 
were  limited  in  two  ways:  1)  they  were  robust  only  against 
linear  time  trends  and  2)  their  robustness  was  only  for  main 
effects.  They  arrived  at  what  they  believed  were  optimum 
designs  through  a systematic  examination  of  each  alternative; 
this  becomes  increasingly  expensive  as  the  size  of  the 
design  increases  and  it  also  reduces  experimenter  options. 
Joiner  and  Campbell  (1976)  proposed  to  reduce  the  costs  by 
searching  optimum  combinations  of  a random  subset  of  the 
various  alternatives.  Lancaster  and  Reynolds  (1976)  proposed 
a method  whereby  the  investigator  could  select  the  optimum 
combinations for both main and interaction effects.

Once the columns have been rearranged in an order that
produces a satisfactory combination of factor-level change
count and trend resistance, it is necessary to assign the new
screening design labels. If the design is arranged in the
same manner described when Factor A terms were used, then all
new screening-design labels for both main and interaction
effects and their aliases will remain the same.

Finally, the new experimental conditions must be renamed
because when the columns have been reordered and assigned to
different main effects, the order in which the experimental
conditions will occur will also change. This is accomplished
by merely writing down the letters (using small letters for
the conditions) associated with all main effects with a plus
sign in each row.
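The renaming rule just described can be sketched in a few lines. The function name condition_label is hypothetical, and writing "(1)" for the row in which every main effect is at its low level is the usual factorial convention, assumed here rather than taken from the design tables.

```python
def condition_label(signs, factor_names):
    """Name one experimental condition from one row of the sign matrix.

    `signs` holds +1/-1 for each main-effect column in the row, and
    `factor_names` holds the matching capital letters. The label is the
    string of small letters for the factors carrying a plus sign.
    """
    label = "".join(name.lower() for name, s in zip(factor_names, signs) if s > 0)
    return label or "(1)"  # assumed convention for the all-minus row

print(condition_label([+1, -1, +1, -1], "ABCD"))  # ac
print(condition_label([-1, -1, -1, -1], "ABCD"))  # (1)
```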

When  fewer  than  the  maximum  possible  number  of  factors 
are  studied.  The  designs  in  Table  1 and  Appendices  I,  II 
and  III  are  suitable  for  investigating  up  to  a maximum  of  8, 
16,  and  32  experimental  factors,  respectively,  less  of  course 
the  number  set  aside  to  handle  trend  or  blocking.  Quite 
often,  however,  an  investigator  will  not  want  to  investigate 
the  maximum  number  possible,  and  will  want  to  modify  the 
given  designs  accordingly.  This  is  done  by  simply  striking 
out  each  letter  representing  the  label  of  each  unused  column 
from  the  letter  designations  of  the  experimental  conditions, 
and  by  removing  all  interactions  in  the  strings  of  aliases 
containing  those  letters.  This  may  create  an  uneven  number 
of  interactions  among  the  strings. 
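The striking-out procedure can be sketched as follows. The condition labels and the alias string in the example are hypothetical and are not copied from Table 1; the function names are likewise illustrative only.

```python
def strike_unused(labels, unused):
    """Remove the letters of unused factors from each condition label."""
    unused = {u.lower() for u in unused}
    out = []
    for label in labels:
        kept = "".join(ch for ch in label if ch.isalpha() and ch not in unused)
        out.append(kept or "(1)")  # "(1)": no factor left at its high level
    return out

def prune_alias_string(alias_string, unused):
    """Drop every interaction in an alias string that involves an unused factor."""
    unused = {u.upper() for u in unused}
    return [term for term in alias_string if not (set(term) & unused)]

# Hypothetical labels and alias string for an eight-factor design
# in which factors G and H are not used.
print(strike_unused(["(1)", "aegh", "bfgh", "abef"], "GH"))
print(prune_alias_string(["AB", "CG", "DH", "EF"], "GH"))
```

Because different strings lose different numbers of terms, the pruned strings will generally be of unequal length, as noted above.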

For example, in Table 1, if there were only six factors
in the experiment and no G and H factors were used, then the
experimental conditions would be changed as follows:


The  columns  in  which  no  main  effects  are  located  are 
now  used  to  estimate  directly  the  effects  of  particular 
strings of three-factor interactions.


III.  EFFECTIVE USE OF CENTER POINTS IN SCREENING DESIGNS


Unreplicated 2^(k-p) screening designs have two distinct
limitations: 1) they cannot measure possible curvilinear
relations between independent and dependent quantitative
variables; 2) they provide no direct estimate of the experimental
error variance. These limitations are recognized, but to obtain
such information would be costly and, for screening purposes,
would be of little value and certainly not cost-effective. In
later stages of research, this information does become important.
Since the investigator can ordinarily anticipate continuing
his experiment beyond the screening phase, he would
fit a non-linear model, if necessary, and obtain an external
error estimate at that time.


Data from at least three levels of each continuous
factor are needed to measure the curvature of the response
surface. Screening designs ordinarily have only two levels.
Design points must be replicated several times to estimate
error variance. Replication for this purpose is usually
discouraged in the screening study. However, once the
decision is made to get this information, it can be obtained
most economically by adding data-collection points at the
center of the experimental design. Data collected at the
center of the design (with coded coordinates 0, 0, ..., 0, when
the original screening design coded coordinates were +1 and
-1) will provide some estimate of curvilinearity for every
factor.

By adding a single point at the center of the experimental
design, a third (middle) level of every factor
of the screening design is measured. This is illustrated in
Figure 3. With three levels, -1, 0, and +1, performance at
each end point is estimated by averaging one half of the
experimental conditions in the original screening design.

[Figure 3. Illustrating How Single Center Point Enables Each
Factor to be Tested at Three Levels. Panel label: Three levels
for Factor A.]

The center position, however, would be estimated from
the performance of only the single center point. Because
of this uneven precision along the dimension, with the
poorest being at the center of interest, repeated measures
should be made at the center of the design. This center-point
replication will also provide an empirically derived
estimate of the experimental error.
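The two uses of replicated center points can be sketched numerically. The contrast used below, the mean of the factorial points minus the mean of the center points, is the standard single-degree-of-freedom check for overall curvature; it is offered as an assumption, since the formula is not given at this point in the report. The scores and the function name are hypothetical.

```python
import statistics

def center_point_summary(factorial_y, center_y):
    """Summarize what replicated center points add to a two-level design.

    Returns (curvature, pure_error_variance):
      curvature           mean of factorial points minus mean of center
                          points; near zero if the surface is planar
      pure_error_variance sample variance of the replicated center
                          points, with len(center_y) - 1 degrees of freedom
    """
    curvature = statistics.fmean(factorial_y) - statistics.fmean(center_y)
    pure_error_variance = statistics.variance(center_y)
    return curvature, pure_error_variance

# Hypothetical performance scores.
factorial_scores = [10.0, 12.0, 14.0, 16.0]
center_scores = [15.0, 16.0, 17.0]
print(center_point_summary(factorial_scores, center_scores))  # (-3.0, 1.0)
```

This single contrast cannot say which factor is responsible for the curvature; it only signals that some curvature is present somewhere in the experimental space.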

CENTRAL-COMPOSITE DESIGNS


Box and Hunter (1958) propose using this center-point
replication technique in their central-composite designs,
where there are still more advantages than indicated above.
Since central-composite designs follow in the research
program once the critical factors* have been screened, multiple
center points should be included in the screening
designs whenever appropriate. The number of center points
in central-composite designs affects the following design
characteristics and functions:

1.  The test for the presence of quadratic effects
in the first-order model and higher-order
effects in the second-order model.

2.  The estimate of "pure" error variance needed
to test the statistical significance of lack
of fit.

3.  The uniformity of the "information" profile
(which is based on the number of observations
at each point in the response surface).

4.  The orthogonality of the central-composite
design.

5.  The "rotatability" of the central-composite
design.

6.  The ability to isolate block and trend effects.

As applied to the central-composite design, the above items
are discussed in considerable detail by Box and Hunter
(1958, pp 152-168) and Simon (1974, p 102; 1976a, pp 22-28).
Lack-of-fit tests can be applied to screening designs supplemented
with multiple center points. These will be
discussed in the Analysis section of this report.

*

An optimum design strategy would use the data collected
in the screening design as a block of data making up the cube
portion of the central-composite design. The methodology for
handling this transition will not be discussed in this report.

SCALING 

Once an experimenter has decided to add center points to
the screening design, he is forced for the first time to consider
what measurement scale to use for each factor. Up to
now, since basic screening designs are made up with two levels
per factor, only a linear response surface could be estimated
regardless of what shape might actually exist in the real world.
Adding center points complicates the situation.

Let us, for example, consider a 2^3 factorial study
involving Sensor Resolution (5 and 15 feet), Target Brightness
(10 and 100 foot-lamberts), and Vehicular Speed (300
and 600 knots). The pairs of values set the limits of the
three-dimensional experimental space. The experimenter who
decides to add center points should not automatically
select the point with coordinates in the center of each
dimension, i.e., 10 feet resolution, 55 foot-lamberts
brightness, and 450 knots vehicular speed. Instead, he
should first consider what scale will enable the experimental
space to be represented by as simple a function as possible.

To illustrate this, let us consider the brightness scale.
With a center point at 55 foot-lamberts, experience has shown
that the scale would relate non-linearly with a visual performance
task (Figure 4A). On the other hand, when brightness
data are plotted on a logarithmic scale, the relation would more
nearly approximate a straight line (Figure 4B).


[Figure 4. Plotting Brightness on Linear and Logarithmic Scales.
Panels A and B; horizontal axis marked at the coded levels
-1, 0, and +1.]


Since economical multifactor research is most successful
when the relationship is simple, and since fewer conditions need
be studied to approximate the less complex functions, the
experimenter would be better off using a log foot-lambert
scale while maintaining the range between 10 and 100
foot-lamberts (i.e., one and two log foot-lamberts). This means
that the center point on that scale would be at 1.5 log
foot-lamberts, or 31.6 foot-lamberts, instead of 55 foot-lamberts.

A similar decision must be made for the Vehicular Speed.
The experimenter would want to consider whether speed or
rate (each the reciprocal of the other) is likely to give the
simplest function. The choice will determine whether 450
knots or its reciprocal in seconds would be used.
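The brightness arithmetic above can be checked with a small sketch. The helper center_on_scale is hypothetical; it simply takes the midpoint on whatever transformed scale the experimenter has chosen and converts that midpoint back to the original units.

```python
import math

def center_on_scale(low, high, forward, inverse):
    """Midpoint of [low, high] on a transformed scale, in original units.

    `forward` maps original units to the chosen scale and `inverse`
    maps back; both are supplied by the caller.
    """
    return inverse((forward(low) + forward(high)) / 2.0)

# Brightness limits from the text: 10 and 100 foot-lamberts.
linear_center = center_on_scale(10, 100, lambda x: x, lambda x: x)
log_center = center_on_scale(10, 100, math.log10, lambda u: 10 ** u)
print(linear_center, round(log_center, 1))  # 55.0 31.6
```

The same helper could be given a reciprocal transformation for the speed factor if the experimenter judged that scale to yield the simpler function.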


QUALITATIVE  FACTORS 

Center points can be added to a design only when the
factors are quantitative and continuous. Categorical variables
have no order and therefore no center values. However,
when quantitative and qualitative variables are studied in
the same experiment, center points can still be added. In
that case, the condition would be centered only within the
space defined by the quantitative, continuous variables.
This restricted center point would be replicated once for
each unique combination of the qualitative variables.

This is illustrated in Table 4. A sign matrix is given
for an experiment with two qualitative and two quantitative
factors. The first sixteen conditions are those of a
full 2^4 factorial, with + and - representing the coded +1
and -1, high and low values. The last four conditions show
how center points (0,0 in the coded terms) for the quantitative
variables are added.


TABLE 4

CENTER POINTS IN AN EXPERIMENTAL DESIGN
INVOLVING QUANTITATIVE AND QUALITATIVE FACTORS

                             FACTORS
Experimental        Qualitative        Quantitative
Conditions*          I      II          III     IV

Original factorial design:

     1               -      -            -      -
     2               +      -            -      -
     3               -      +            -      -
     4               +      +            -      -
     5               -      -            +      -
     6               +      -            +      -
     7               -      +            +      -
     8               +      +            +      -
     9               -      -            -      +
    10               +      -            -      +
    11               -      +            -      +
    12               +      +            -      +
    13               -      -            +      +
    14               +      -            +      +
    15               -      +            +      +
    16               +      +            +      +

Minimum number of additional center points:

    17               -      -            0      0
    18               +      -            0      0
    19               -      +            0      0
    20               +      +            0      0

* This is not intended to be an optimized presentation order.
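The construction of Table 4 can be sketched as follows: a full two-level factorial over all factors, followed by one restricted center point for each combination of the qualitative factors. The function name is hypothetical, and the listing order is standard order, not an optimized presentation order.

```python
from itertools import product

def design_with_center_points(n_qualitative, n_quantitative):
    """Full two-level factorial plus restricted center points.

    Qualitative factors come first in each row. Center points set every
    quantitative factor to 0 and are repeated once for each combination
    of the qualitative factors.
    """
    k = n_qualitative + n_quantitative
    # reversed(): the first factor changes fastest (standard order).
    factorial = [tuple(reversed(row)) for row in product((-1, +1), repeat=k)]
    centers = [tuple(reversed(q)) + (0,) * n_quantitative
               for q in product((-1, +1), repeat=n_qualitative)]
    return factorial + centers

design = design_with_center_points(n_qualitative=2, n_quantitative=2)
for number, row in enumerate(design, start=1):
    print(number, row)
```

With two qualitative and two quantitative factors this gives the 16 factorial conditions and the 4 additional center points of Table 4.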

Subjects in psychological experiments appear either as
1) identifiable types who can be represented as composite
levels of subject factors, or as 2) unidentified masses,
presumed to be homogeneous members of the same population.
presumed  to  be  homogeneous  members  of  the  same  population. 

SUBJECT  CHARACTERISTICS  AS  EXPERIMENTAL  FACTORS 

There are two situations that can exist when we wish to
include subject characteristics as factors along with
equipment/environment factors and temporal factors. In one, each
subject is selected having the characteristics required by
the sign matrix. In the other, measurable subject characteristics
are known but it is difficult or impossible to select
subjects with the required combinations.


Measuring  Subject  Characteristics  as  Part  of  the  Design 


If each subject characteristic were to be investigated
at two levels, and there are f characteristics, 2^f subjects
would be required to exhibit all of the required combinations
of characteristics. Each subject would be tested on
a particular combination of the remaining factor levels,
where the combined characteristics of subjects and other
factors would represent a specific experimental condition
as defined by the sign matrix.


A study on target acquisition performed at the Naval
Weapons Center, China Lake, California (Grossman and Whitehurst,
1976) illustrates how subject characteristics can and
should be introduced into the experimental design of the
screening study. Three of eleven factors in that study were
subject factors.* This required a minimum of 2^3 = 8 subjects,
each having the appropriate combination of characteristics
as indicated in the following sign matrix:


Subject #     Acuity (A)     Depth Perception (B)     Color Vision (C)

    1             -                   -                      -
    2             +                   -                      -
    3             -                   +                      -
    4             +                   +                      -
    5             -                   -                      +
    6             +                   -                      +
    7             -                   +                      +
    8             +                   +                      +
Where  - represents  the  poor  condition  and  + represents  the 
good,  according  to  specified  criteria. 

Each subject was tested under appropriate combinations
of the eleven equipment/environment factors required to
complete the 16 experimental conditions of the complete
2^(11-7) Resolution III design. For example:




* 

A fourth, labeled Experience (D), might be considered
a subject characteristic but was introduced into this experimental
design as a temporal factor. Each subject ran
through the experiment twice. The first measurement of each
condition was considered low experience and the second
measurement of each was considered high experience.


Although a number of subjects are involved, each
experimental condition is represented only once in an
unreplicated design. If one considers subject factors
equally as important as equipment factors, then no distinction
need be made in the analysis of the data. If the
purpose is to order the factors, whatever the source, according
to their relative effects on the performance of the task
under investigation, then this screening design can be used.




Measuring  Subject  Characteristics  Not  in  the  Design 

When it is not possible to vary subject parameters by
systematically selecting a subject with precisely the correct
combination of characteristics, then measurements should be
made of the characteristics as they actually exist in the
subjects who are used. If over the entire experiment the
variables tend to distribute themselves relatively normally,
then their effects can be estimated along with the more systematic
ones using a regression analysis. One can visualize
the variables laid out as terms of a polynomial to estimate
performance, \hat{y}:

\hat{y} = \beta_1 A + \beta_2 B + \beta_3 C + \beta_4 D + \beta_5 E + \beta_6 F + \cdots

where the italicized letters represent measured values of the
uncontrolled variables (probably correlated among one another
and the other variables) while the Roman letters represent
selected levels of the controlled factors of the factorial
(or fractional factorial). The \beta's are the weights of each
variable as determined by a regression analysis, preferably
ridge regression analysis (Simon, 1975). As the correlation
among variables increases, ridge regression analysis is
superior to the conventional multiple regression analysis for
this purpose. However, when uncontrolled variables are to
be measured and analyzed along with the controlled variables
in the experimental design, enough extra observations must be
made to provide the degrees of freedom needed to cover the
additional uncontrolled variables.* These degrees of freedom
may be obtained if the basic design is replicated using a
representative sample of different subjects selected in
some random manner. The required number of degrees
of freedom may also be obtained if the orthogonal
design is analyzed first in the prescribed manner, and those
factors that are definitely trivial are dropped from the
analysis. Presuming the number of factors dropped is equal
to or greater than the number of measured variables, there
will be enough degrees of freedom available to re-analyze the
data to include the uncontrolled but measured variables
(co-variables). While there are some dangers associated with
this latter procedure, an alert investigator should be able
to detect them if they arise. The odds favor the latter
approach, which maintains the integrity of economy in a
screening study.

*

Use of this technique need not be limited to uncontrolled
subject variables, but can be applied to any type
of uncontrolled variable.
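The regression on controlled factors plus measured co-variables can be sketched with the basic ridge estimator, (Z'Z + kI)^-1 Z'y, computed on standardized predictors. This is a minimal sketch of ordinary ridge regression, not the specific procedure of Simon (1975); the biasing constant k, the data, and the function name are assumptions made for illustration.

```python
import numpy as np

def ridge_coefficients(X, y, k):
    """Ridge weights for the columns of X, returned in original units.

    X : one row per observation; columns are coded factor levels
        and/or measured co-variables
    y : performance scores
    k : ridge biasing constant (k = 0 gives ordinary least squares)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    scale = np.sqrt((Xc ** 2).sum(axis=0))
    Z = Xc / scale                      # Z'Z is the correlation matrix
    yc = y - y.mean()
    p = Z.shape[1]
    b_std = np.linalg.solve(Z.T @ Z + k * np.eye(p), Z.T @ yc)
    return b_std / scale

# Hypothetical data: two controlled factors (a replicated 2^2) plus one
# measured, uncontrolled subject variable.
A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
covariate = [0.2, -0.5, 0.9, 0.1, -0.3, 0.7, -0.8, 0.4]
X = np.column_stack([A, B, covariate])
y = 5 + 2 * X[:, 0] + 3 * X[:, 1] + 1.5 * X[:, 2]
print(ridge_coefficients(X, y, k=0.0))
print(ridge_coefficients(X, y, k=0.1))
```

The number of observations must exceed the number of terms in the equation, which is the degrees-of-freedom requirement discussed above.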


SUBJECTS  AS  REPLICATION 

Replication is the antithesis of experimental economy.
In some cases, it is used unreasonably. Such is the case
when an investigator replicates a fractional factorial design.
If he intends to expend this additional effort collecting
more data, it would be far more informative to add a different
fraction to the design than it would be to replicate
the original fraction. In this way, more sources of variance
in aliased strings could be isolated, increasing the investigator's
understanding of the situation. As Daniel (1976,
p 10) says: "The most useful replication will be that which
best samples the population of conditions about which E wants
to make inferences. In this sense, the best replication is
done under different conditions, not under the same conditions."
Simon (1973, pp 19-32) reviewed the arguments psychologists
frequently give for replicating, and indicated
their weaknesses and alternative solutions.

Two valid reasons for "replicating" with subjects, after
all other alternatives have been exhausted, are to establish
inter-subject reliability and to obtain confidence intervals
or fiducial limits.


Replicating for Inter-subject Reliability


An investigator never really knows if there are unwanted
and unknown sources of variance affecting his experimental
data. No matter how careful he may be (and there appear
to be large differences among investigators in the care with which
they collect experimental data; Simon, 1976b), an investigator
should impose checks on the quality and consistency of
his data. This means that when a second subject is tested on
all the experimental conditions, the data from each subject
should be analyzed separately and compared. This not only
permits a check on the consistency of responses among homogeneous
subjects as well as the assumption of homogeneity,
but also helps detect distortions and outliers in the data.
Some hints in this regard are discussed in the section on
Analysis. The investigator may even wish to test more
subjects (still making individual examinations of the results)
until he builds confidence in a particular set of conclusions
or discovers reasons for not accepting them.
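Analyzing each subject separately and then comparing can be sketched as follows. An effect is computed as the mean at the high level minus the mean at the low level of a factor; the design, the scores, and the function name below are hypothetical.

```python
def main_effects(sign_matrix, y):
    """Effect of each factor: mean response at +1 minus mean response at -1."""
    effects = []
    for j in range(len(sign_matrix[0])):
        high = [yi for row, yi in zip(sign_matrix, y) if row[j] > 0]
        low = [yi for row, yi in zip(sign_matrix, y) if row[j] < 0]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects

# Hypothetical 2^2 design run on two subjects.
signs = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]
subject_1 = [10.0, 14.0, 11.0, 15.0]
subject_2 = [9.0, 13.0, 12.0, 16.0]

effects_1 = main_effects(signs, subject_1)
effects_2 = main_effects(signs, subject_2)
differences = [e1 - e2 for e1, e2 in zip(effects_1, effects_2)]
print(effects_1, effects_2, differences)  # [4.0, 1.0] [4.0, 3.0] [0.0, -2.0]
```

In this illustration the two subjects agree on the first factor and disagree on the second, which is the kind of inconsistency the separate analyses are meant to expose.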


While methods of isolating experimental from trend
effects in screening studies have been described, an investigator
may be as concerned with cross-over effects as he is
with trends. If so, he may decide to present the experimental
conditions to several subjects in different orders in a way
which will enable cross-over effects to be isolated from experimental
effects (Simon, 1974, pp 27-90).*

*

Economical designs that are robust both to trend and
cross-over effects have not been worked out.


Replicating to Establish Confidence Intervals


The appropriate research strategy is to establish
confidence intervals at the end of the experimental program.
Once an equation containing all of the critical factors has
been derived, those combinations of factor values that
optimize performance or represent combinations of practical
interest would be used to test a group of "truly" homogeneous
subjects. Subjects can be considered homogeneous after the
investigator has separated them into groups on the basis of
critical subject variables and any remaining within-group
subject variability is small and not readily identifiable.
It is the "what's left over" after all efforts to identify
the sources have been exhausted. Generally, establishing
confidence intervals would be done in the operational environment,
where that information would be most useful.


DATA ANALYSIS


• How to calculate the criteria for deciding which
factors are critical to the task under investigation
and which are marginal or trivial:
effects, eta squared, cumulative proportion
of variance, half-normal plots.


• How  to  analyze  subjects  used  to  replicate  the 
basic  screening  design. 


• How  to  adjust  experimental  effects  for  trends. 


• How  to  analyze  multiple  responses:  graphical 
and  statistical  methods. 


• How  to  evaluate  how  well  first  and  second  order 
regression  equations  fit  the  empirical  data. 


• How  to  analyze  an  incomplete  screening  design. 
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V.  CALCULATING  CRITERIA  TO  SELECT  NON-TRIVIAL  FACTORS 


Since the purpose of the screening study is to identify those
factors out of a larger candidate group which have non-trivial
effects on performance, the first step of the analysis is to
calculate a number of criteria which will help the experimenter
make that judgment. It is appropriate at this time, before the
analysis begins, to emphasize the point that there are no
mechanical methods of selecting the trivial and non-trivial
factors. Lest the unsophisticated investigator believe that
requiring subjective decisions on the part of the investigator is
unscientific and is a weakness confined to these screening studies,
let him be assured that this is not the case. Evaluating the
results from a screening-design study is no different from
evaluating the results from an analysis of variance by hypothesis
testing. Accepting or rejecting the hypothesis is done by the
investigator, not the F-test (Bakan, 1967). Statistics applied to
the empirical data may facilitate a decision.

SELECTION CRITERIA

Whether or not a factor is considered non-trivial will be based on
the following criteria:

1.  Does it have a practical effect on performance?

    This can be determined by calculating its effect, i.e., the
    mean difference between the high and the low value of that
    factor.

    Precaution: If the pair of values per factor in the
    experimental design does not cover the full range of interest,
    an estimated effect will not be indicative of the full strength
    of this factor.

2.  Does the factor account for a meaningful proportion of the
    variance in the experiment?

    This is determined by calculating eta squared, or the ratio of
    the sum of squares for the factor to the total sum of squares.

    Precaution: If the candidate list does not include essentially
    all of the critical factors affecting performance under
    operational conditions, then proportions obtained in the
    experiment will be deflated when applied to a real-world
    problem.

3.  Does including the factor materially improve the ability to
    predict performance under operational conditions?

    This is determined by examining the cumulative proportion of
    variance obtained when the effects of the factors are combined.

    Precaution: If an effect is due to chance in this sample,
    including it will lead to poorer prediction in subsequent tests
    (shrinkage).

4.  Could the observed effect have been due to chance?

    Without a source of error variance, the investigator must rely
    on less direct indications (i.e., internal tests) of a chance
    phenomenon. Examining the data using "half-normal plots" may be
    useful for this purpose.

    Precaution: This graphic inspection of portions of the data is
    still a poorly developed art.

5.  Can the cumulative effects of a large number of non-critical
    factors be ignored?

    While some factors may show only small effects, they
    nevertheless have an impact on performance. If there is a large
    number of marginal factors, and according to the principle of
    maldistribution that is what we expect, we may wish to exclude
    them during an initial screening, but to examine them more
    carefully during the refinement phase of the program. Together
    they may improve prediction considerably.

In applying the above criteria, the investigator will temper his
judgment with the cost of each decision, as well as by satisfying
the interests of those who have sponsored the research. With an
iterative research strategy, in which decisions are constantly
being tested, no decision need be final. Factors included or
excluded early in the program may be excluded or included later in
the program, if necessary. Generally, the errors in decision will
occur with the marginal factors, where the practical effect of an
error is the smallest.

Estimating  the  Effects 

The "effect" of a factor is the mean difference between the
performances measured on the two levels or conditions of that
factor.* An investigator has several methods at his disposal for
estimating effects in the 2^(k-p) screening design.

*Some statisticians use the term "contrast" instead of "effect."

The conventional method (for psychologists) of finding the effect
of Factor A, for example, in a 2^k design, would be to add up all
of the performance values in all the cells associated with one
condition of Factor A and to add up all of the performance values
associated with the other condition of Factor A. The means of
these sums would represent the mean performance on the two
conditions, and the difference between the two means is referred
to as the "effect." This is illustrated with some fictitious data
in Table 5.

As more factors are included in the screening experiment, the sign
matrix can be used to facilitate the analysis. The sign matrix for
an eight-run two-level design, along with fictitious performance
data, is used to show how a sign matrix is used (Table 6). For
example, the effect of Factor A is estimated by summing all
performance scores obtained when the "high" (+) condition of A is
being tested, e.g.:

     4 + (-5) + 3 + 5 = 7

This sum is divided by 4, giving a mean performance of 1.75 for
the high condition. Next, all performance scores obtained when the
"low" (-) condition of A was being tested are summed, e.g.:

     2 + 3 + 1 + (-2) = 4

This sum is divided by 4, giving a mean performance value of 1.00
for the low condition. Subtracting the low from the high gives a
mean difference of 0.75, the effect of Factor A.

Similarly, the effect of the interaction AB would be obtained as
follows:

     (4 + 3 + 1 + 5) - (2 - 5 + 3 - 2) = 15;   15 / 4 = 3.75

The divisors for the means in these cases always equal half the
total number of performance values. Each column of the matrix,
i.e., each source of variance, is treated in the same way.
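
The sign-matrix arithmetic above can be checked mechanically. The following is a minimal Python sketch, not part of the original analysis; the function name `effect_from_signs` and the row order are illustrative assumptions, since only the scores falling under each sign are quoted in the text.

```python
def effect_from_signs(signs, scores):
    """Mean of the scores marked +1 minus the mean of the scores marked -1."""
    high = [y for s, y in zip(signs, scores) if s > 0]
    low = [y for s, y in zip(signs, scores) if s < 0]
    return sum(high) / len(high) - sum(low) / len(low)


# Assumed row order: the four "+" rows first, then the four "-" rows.
signs = [+1, +1, +1, +1, -1, -1, -1, -1]

# Scores quoted in the text for Factor A (high: 4, -5, 3, 5; low: 2, 3, 1, -2)
effect_a = effect_from_signs(signs, [4, -5, 3, 5, 2, 3, 1, -2])

# Scores quoted in the text for the AB interaction
effect_ab = effect_from_signs(signs, [4, 3, 1, 5, 2, -5, 3, -2])

print(effect_a, effect_ab)  # 0.75 3.75
```

The same function applies to any column of a sign matrix, which is why each source of variance can be treated identically.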


Yates' algorithm. When the effects of a large number of factors
must be estimated, using the sign matrix can become tedious and
the chance of making arithmetic errors increases if a computer is
not used.* Yates (1937) developed a systematic tabular method of
calculating the effects of 2^k designs which is adaptable to
2^(k-p) designs, including screening designs. An example of the
analysis of a 2^3 design using Yates' algorithm is given in
Table 7. The steps are these:

1.  List the 2^k experimental conditions in the Standard Order
    (Column I). This Standard Order is (1), a, b, ab, c, ac, bc,
    abc, d, ad, bd, abd, cd, and so forth, where after the (1)
    condition, a factor at a time is added, followed by all
    interactions between that factor and each previous factor
    combination before a new factor is added.

2.  List each performance score adjacent to the corresponding
    conditions (Column II). If it will simplify calculations, a
    constant can be subtracted from every score without affecting
    the estimation of the effects. Only the mean must be corrected
    by that constant amount.

*Even if the calculations are done with a computer, there is a
material advantage in using Yates' algorithm. Because the Standard
Order is assumed (or corrected for later if the initial assumption
is incorrect), the only inputs to the computer are the performance
scores. No 2^k sign matrix need be input, a savings in the
programming and card punching requirements.

TABLE 7

YATES' ALGORITHM FOR ANALYZING A 2^k FACTORIAL

               I             II                        3          III        IV
Standard   Experimental                             Effect-     Effect
 Order      Condition    Performance    1     2      total      (÷ N/2)    Source

   1)         (1)             4         6     4       11         2.75      Mean x 2
   2)         a               2        -2     7       -3         -.75      A
   3)         b              -5         3     6       -7        -1.75      B
   4)         ab              3         4    -9       15         3.75      AB
   5)         c               5        -2    -8        3          .75      C
   6)         ac             -2         8     1      -15        -3.75      AC
   7)         bc              3        -7    10        9         2.25      BC
   8)         abc             1        -2     5       -5        -1.25      ABC

3.  Separate the numbers in Column II into pairs and add the two
    values in each pair, taking signs into consideration. List
    these sums in order in the upper half of Column 1.

    Next, start again at the top and subtract the FIRST number
    from the second of each pair in Column II and list the
    differences in order in the lower half of Column 1.

4.  Repeat this process to create Column 2 using the numbers in
    Column 1.

5.  Continue to add and subtract adjacent pairs in each list to
    create a new list until there is a total of k numbered columns
    for 2^k experimental conditions. In the example in Table 7,
    with 8 = 2^3 conditions, there are three numbered columns
    (Columns 1, 2, and 3).

6.  The effect for each factor is obtained by dividing the
    appropriate value in the last numbered column, referred to as
    the "effect-total,"* by a value equal to half the total number
    of observations in the experiment (Column III).

*Sometimes called the "contrast-sum."

The effects thus calculated will also be listed in the Standard
Order, the first value being equal to twice the mean* of all the
data, the second being the effect of Factor A, the third being the
effect of Factor B, then AB, C, AC, BC, ABC, D, AD, and so forth
(Column IV).
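
The six steps above translate directly into a short routine. What follows is a minimal Python sketch under the assumption that the scores are supplied in Standard Order; the function names `yates` and `standard_order_sources` are illustrative and do not come from the report. The data are the performance scores of Column II in Table 7.

```python
def yates(scores):
    """Return the effect-totals, in Standard Order, for 2^k scores."""
    n = len(scores)
    k = n.bit_length() - 1
    if n < 2 or 2 ** k != n:
        raise ValueError("number of scores must be a power of two")
    column = list(scores)
    for _ in range(k):  # one numbered column per factor
        pairs = [(column[i], column[i + 1]) for i in range(0, n, 2)]
        sums = [first + second for first, second in pairs]
        diffs = [second - first for first, second in pairs]
        column = sums + diffs  # sums in upper half, differences in lower half
    return column


def standard_order_sources(factors):
    """Source labels in Standard Order: mean, A, B, AB, C, AC, BC, ABC, ..."""
    labels = ["Mean x 2"]
    for factor in factors:
        labels += [factor] + [label + factor for label in labels[1:]]
    return labels


scores = [4, 2, -5, 3, 5, -2, 3, 1]  # Column II of Table 7
effect_totals = yates(scores)
effects = [total / (len(scores) / 2) for total in effect_totals]
for source, effect in zip(standard_order_sources("ABC"), effects):
    print(source, effect)
```

Run on the Table 7 scores, the routine reproduces the effect-totals of Column 3 and the effects of Column III.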


When this analysis is used with screening (or other fractional
factorial) designs in which the original factorial labels are
changed to new screening design labels, and effects are aliased,
an equivalent change must be made in the factor labels of the
analysis using Yates' algorithm. Corresponding new labels must be
substituted for the old labels that appear in the Standard Order
in the effects column.


Daniel (1956, p 93) writes: "With N_R as large as 32, Yates'
computational form may be split into two forms of size 16, using
sum and differences of pairs over the last factor, instead of the
original single results. This subdivision may be continued further
for N_R larger than 32."

This is illustrated in Table 8. The performance values (Column II)
of the experimental conditions (Column I) listed in Standard Order
would be divided in half, with performances associated with all
low conditions of the last factor (i.e., C in this example) being
analyzed with Yates' algorithm as one problem and performances
associated with all high conditions of the last factor analyzed as
a separate problem. Since only half the data is in each problem,
there will be one less column in each sub-analysis (Columns 1 and
2) than would be in the full analysis. When the effect-total
values are obtained for each half (Column 2'), they would be
intermingled, alternating with the first effect-total in the
low-condition analysis followed by the first effect-total in the
high-condition analysis, and continuing to alternate in this
fashion until the two halves are completely paired (Column 2").
This new column is then treated to the sum-difference analysis as
if it had been the next to last column of the full analysis,
giving the effect-total values (Column 3). Then in this example,
if Column 3 is divided by N/2 = 4, we obtain the mean doubled and
the effects in Standard Order.

*We divided by N/2 in Step 6. To get the mean, we would of course
divide by N.


TABLE 8

USING A SPLIT-YATES' ALGORITHM TO ANALYZE A 2^k DESIGN

                    I     II    1      2            2'     2"    3      III

Half with low C    (1)    a    a+b    a+b+c+d    = (A)     A    A+E    Mean x 2
                   a      b    c+d    b-a+d-c    = (B)     E    B+F    A
                   b      c    b-a    c+d-a-b    = (C)     B    C+G    B
                   ab     d    d-c    d-c-b+a    = (D)     F    D+H    AB

Half with high C   c      e    e+f    e+f+g+h    = (E)     C    E-A    C
                   ac     f    g+h    f-e+h-g    = (F)     G    F-B    AC
                   bc     g    f-e    g+h-e-f    = (G)     D    G-C    BC
                   abc    h    h-g    h-g-f+e    = (H)     H    H-D    ABC

Column I:    Original experimental conditions in Standard Order.
Column II:   Performance values (symbolic) in Standard Order.
Columns 1-2: Yates' algorithm applied separately to each half,
             split on the low level and high level of Factor C.
Column 2':   Effect-totals for each half.
Column 2":   Effect-totals of each half after pairing.
Column 3:    Overall effect-totals.
Column III:  Combined effects for the complete experiment
             (Column 3 divided by 4), in Standard Order.

Lower-case a through h in Columns II, 1, and 2 stand for the eight
performance values; (A) through (H) in Columns 2', 2", and 3 stand
for the effect-totals of the two halves; the capital letters in
Column III name the sources.
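
The split procedure can be sketched in Python as follows. This is an illustrative sketch only, assuming scores in Standard Order so that the first half of the list holds the low level of the last factor; the names `sum_diff_pass`, `yates`, and `split_yates` are hypothetical. The data are again the Table 7 scores, so the result can be compared with the unsplit analysis.

```python
def sum_diff_pass(column):
    """One Yates column: pair sums on top, (second - first) differences below."""
    pairs = [(column[i], column[i + 1]) for i in range(0, len(column), 2)]
    return [a + b for a, b in pairs] + [b - a for a, b in pairs]


def yates(scores):
    """Full Yates analysis: k passes for 2^k scores in Standard Order."""
    column = list(scores)
    for _ in range(len(column).bit_length() - 1):
        column = sum_diff_pass(column)
    return column


def split_yates(scores):
    """Analyze each half separately, interleave, then make one final pass."""
    half = len(scores) // 2
    low = yates(scores[:half])    # last factor at its low level (Column 2')
    high = yates(scores[half:])   # last factor at its high level (Column 2')
    paired = [value for pair in zip(low, high) for value in pair]  # Column 2"
    return sum_diff_pass(paired)  # Column 3


scores = [4, 2, -5, 3, 5, -2, 3, 1]
print(split_yates(scores))  # [11, -3, -7, 15, 3, -15, 9, -5]
```

The interleaved column is identical to the next-to-last column of the unsplit analysis, which is why a single further sum-difference pass completes it.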

Estimating the coefficients of the multiple regression equation.
In screening designs, the equation would take the form:

     \hat{Y} = b_0 X_0 + b_a X_a + ... + b_k X_k + b_{ij} X_{ij} + ...

where

     \hat{Y} = estimated performance

     b_i     = coefficient for factor i, where i = a to k

     X_i     = term representing the level of factor i; X_0 = 1

     b_{ij}  = coefficient for the string of two-factor
               interactions

     X_{ij}  = term representing the string of two-factor
               interactions

The regression coefficient, b_i, equals \sum Y X_i / \sum X_i^2.
However, in the basic screening design, \sum X_i^2 = N and
\sum Y X_i equals the effect-total in the Yates' algorithm. Thus
the regression coefficient for a multiple regression equation can
be obtained as follows:

     b_i = Effect-total / N = Effect / 2
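
The relation between regression coefficients and effects can be verified numerically. The sketch below is illustrative only: it assumes each X column is coded -1 (low) and +1 (high), with interaction columns formed as products of factor columns, and it uses the Table 7 scores as data. The function names `coded_column` and `regression_coefficients` are hypothetical.

```python
def coded_column(mask, k):
    """Coded -1/+1 column for one source of a 2^k design in Standard Order.

    Bit j of `mask` says whether factor j belongs to the source; mask 0 is X_0.
    """
    column = []
    for run in range(2 ** k):
        sign = 1
        for j in range(k):
            in_source = (mask >> j) & 1
            at_high = (run >> j) & 1
            if in_source and not at_high:
                sign = -sign
        column.append(sign)
    return column


def regression_coefficients(scores, k):
    """b = sum(Y * X) / sum(X * X) for every source, in Standard Order."""
    coefficients = []
    for mask in range(2 ** k):
        x = coded_column(mask, k)
        sum_yx = sum(y * xi for y, xi in zip(scores, x))
        sum_xx = sum(xi * xi for xi in x)  # equals N for coded columns
        coefficients.append(sum_yx / sum_xx)
    return coefficients


scores = [4, 2, -5, 3, 5, -2, 3, 1]        # Table 7, Column II
coefficients = regression_coefficients(scores, k=3)
print(coefficients)
```

Each coefficient is half the corresponding effect in Column III of Table 7, and the first coefficient is the mean itself.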
Interpreting effects data. The effects data show the change in
performance that occurs between the two levels of each factor. If
these levels represent the extremes of the operational space, or
the upper and lower limits of performance, then the magnitude of
the effects tells us something of the practical importance of that
factor for the task under consideration. Thus, it is not possible
to make a meaningful interpretation of the results without fully
understanding the design and its context in the real world. In
Figure 5 (solid line), the effects of resolution might be quite
different depending on which two levels had been selected for the
conditions of the experiment:

     Trivial:  AB, DE
     Mild:     CD
     Large:    BD

Figure 5.  Illustration of How Experimental Context (i.e., task
           difficulty) Affects Performance


All this could change as a function of other parameters. For
example, the effect between levels B and C might have been trivial
if all targets had been so large that differences in resolution
were inconsequential. This is illustrated by the dotted line.

When effects are evaluated, however, the interpretation must be
made in the context of the operational situation rather than the
experiment. It is equally important that the performance also be
measured in terms of operationally relevant parameters. For
example, a 2.4-second difference in the speed of reading a
full-size newspaper page would probably not be an important
consideration in selecting one of two styles of type. On the other
hand, a 2.4-second difference might be quite critical in selecting
the design of a safety switch on a nuclear reactor. The
experimenter must look at the effect and decide if one that size
is critical in the performance of the real-world task. If it is
definitely not, then that source of variance can be excluded until
new evidence negates that decision. If it is a marginal effect,
other considerations involving costs and convenience will
determine whether it should be excluded at this time or not. If
the effect being considered represents the sum of a string of
two-factor or three-factor interactions, the investigator should
determine whether or not any of the larger main effects are paired
in the string. If so, it is likely (though definitely not certain)
that the string represents an ordinal interaction which is of
secondary importance. Deciding whether a string contains an
ordinal or disordinal interaction may require more data to be
collected (Simon, 1973, pp 116-124).


Estimating  the  Proportion  of  Variance  Accounted  For 


For an unreplicated 2^(k-p) screening design, the variance, or
mean square, can be calculated quite simply once the effects for
each source have been obtained.

First the Sum of Squares for each individual source of variance is
calculated as follows:

     Sum of Squares = N (Effect)^2 / 4

where N is equal to the total number of observations in the
experiment. Since these designs involve factors at only two
levels, each source has only one degree of freedom. Therefore,
the sum of squares and variance for each effect are equal.

Eta squared. The proportion of total performance variance in the
experiment, accounted for by each source of variance, is
calculated as follows:

     Eta squared = (Sum of squares for particular source)
                   / (Total sum of squares)

Total sum of squares is obtained by summing the sums of squares
for all N-1 sources of variance, including those between blocks,
if any, in the experiment. The mean is not included.
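
As a numerical check on these two formulas, the following minimal Python sketch applies them to the effects of Table 7 (N = 8). It is an illustration only; the function names are hypothetical, and the comparison with the corrected total sum of squares of the raw scores is added here as a cross-check rather than taken from the report.

```python
def sums_of_squares(effects, n_obs):
    """Sum of squares per source: N * (Effect)^2 / 4."""
    return {source: n_obs * effect ** 2 / 4 for source, effect in effects.items()}


def eta_squared(ss):
    """Proportion of the total sum of squares accounted for by each source."""
    total = sum(ss.values())
    return {source: value / total for source, value in ss.items()}


# Effects from Column III of Table 7 (the mean is not included)
effects = {"A": -0.75, "B": -1.75, "AB": 3.75, "C": 0.75,
           "AC": -3.75, "BC": 2.25, "ABC": -1.25}
scores = [4, 2, -5, 3, 5, -2, 3, 1]
n_obs = len(scores)

ss = sums_of_squares(effects, n_obs)
eta = eta_squared(ss)

# Cross-check: the N-1 sums of squares add up to the corrected total
corrected_total = sum(y * y for y in scores) - sum(scores) ** 2 / n_obs
print(sum(ss.values()), corrected_total)  # 77.875 77.875
```

Because the seven sources exhaust the N-1 degrees of freedom of an unreplicated eight-run design, their sums of squares reproduce the total exactly and the eta-squared values sum to 1.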


Interpreting proportion of variance data. In interpreting the
proportion of variance associated with a single source of
variance, two things must be remembered: one, it is a relative
measure and two, its importance depends on how many critical
factors are included in the experiment. As a relative measure, the
magnitude of an eta squared depends on the magnitude of the other
sources of variance in an experiment. Since there is always an
upper limit of 1.00 on the proportion of total variance that can
be accounted for, a source that shows a mean difference of 30
seconds may, for example, account for 25% or 75% of the total
variance depending on whether the other effects and error in the
experiment are relatively large or small, enhancing or decreasing
the absolute total variance, and changing the relative proportion
accounted for by any single effect. With only one factor plus some
random error variance, a factor may account for 90% or 10% of the
total variance depending on whether it is a "clean" experiment
with a little random error or one with a lot. Thus, in
interpreting eta squared, a source that accounts for a small
proportion of variance is likely to be a non-critical source of
variance, but a source that accounts for a large proportion of
variance cannot per se be considered critical. It may have
accounted for most of the performance variability in the
experiment, but in the real world where a great many factors are
likely to be operating, it will account for relatively little. It
is the case of a big frog in a small pond.


The only time when a source with a large proportion-of-variance
value can be considered critical with confidence is when a large
number of factors has been included in the experiment and these
are believed to include most of those likely to be critical under
operational conditions. Other considerations in the interpretation
of eta squared are discussed by Simon (1976b, pp 37-43).


Cumulative  Proportion  of  Variance 


When the sources of variance are ordered from largest to smallest
according to the size of each one's effect and the proportion of
variance accounted for by each is calculated, these proportions
may be added, one at a time, so that as each new source is added
incrementally, the cumulative proportion of the total variance
accounted for by all sources of variance, both factors and
interaction strings, up to that point, is indicated.

Since the sources in a screening design are independent, each
cumulative proportion of variance represents the square of the
multiple correlation coefficient (R^2) for an equation composed of
all sources included up to that point. Each new source adds some
incremental amount, which may or may not be important (which is
what the investigator is trying to decide) and which may in fact
have been a chance effect for the particular sample that would not
likely recur were the experiment repeated. As stated earlier, we
have no way in the single replication screening design of directly
measuring the error variance with which to test the reliability of
each new term. Several indirect (or internal comparison) measures
will be suggested later on.
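
The ordering and accumulation just described can be sketched as follows. This is a minimal Python illustration, not the report's own procedure: the function name `cumulative_proportions` is hypothetical, and the sums of squares are those obtained from the Table 7 effects with N (Effect)^2 / 4 for N = 8.

```python
def cumulative_proportions(ss):
    """Order sources by sum of squares and accumulate their proportions.

    Returns (source, proportion, cumulative proportion) rows; the last
    column is the R^2 of an equation holding all sources up to that row.
    """
    total = sum(ss.values())
    ordered = sorted(ss.items(), key=lambda item: item[1], reverse=True)
    running = 0.0
    rows = []
    for source, value in ordered:
        running += value
        rows.append((source, value / total, running / total))
    return rows


# Sums of squares for the Table 7 sources (assumed input for this sketch)
ss = {"A": 1.125, "B": 6.125, "AB": 28.125, "C": 1.125,
      "AC": 28.125, "BC": 10.125, "ABC": 3.125}

for source, proportion, cumulative in cumulative_proportions(ss):
    print(f"{source:4s} {proportion:.3f} {cumulative:.3f}")
```

Ordering by sum of squares is equivalent to ordering by the absolute size of the effect, since each sum of squares is proportional to the squared effect.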

When we stop at a particular point along the ordered continuum of
sources and calculate the cumulative proportion of variance (or
R^2), we are implicitly assuming, tentatively at least, that the
remaining proportion of variance not accounted for (i.e., 1 - R^2)
is error. This estimate of error might be used to determine at
what point the addition of another term (source of variance)
results in a drop in the population R^2, which is estimated by
applying certain correction factors to the sample R^2. Quite
obviously, the R^2 value for the sample must increase toward 1.00
as more sources of variance are added, but the estimated
population R^2 reaches a point where, instead of increasing as
more sources are added, it will decrease. This could be used as a
clue as to where to stop adding more sources.

While there are a number of formulas to calculate "shrinkage"
(Kerlinger and Pedhazur, 1973; Uhl and Eisenberg, 1970), the
following one is probably as effective as any for our present
purpose and is simple to use:

     \hat{R}^2 = 1 - (1 - R^2) (N - 1) / (N - k - 1)

where \hat{R} is the population multiple correlation corrected
              for shrinkage

      R is the uncorrected sample multiple correlation for the
        k factors

      N is the total number of observations

      k is the number of factors (or sources of variance)
        included in the equation

At some point, as k increases, \hat{R}^2 will begin to decrease.
This is the maximum number of sources that should be considered.
In practice, however, this formula gives an overestimation, and
the k sources are probably too many to include.
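
The stopping rule can be illustrated with a short Python sketch. Two assumptions are made loudly here: the correction is taken to have the widely used form 1 - (1 - R^2)(N - 1)/(N - k - 1), and the cumulative R^2 values below are invented for illustration (a hypothetical 16-observation experiment), not results from the report. The function names are likewise hypothetical.

```python
def shrunken_r2(r2, n_obs, k):
    """Shrinkage-corrected R^2 (assumed form, see lead-in)."""
    if n_obs - k - 1 <= 0:
        raise ValueError("need more observations than terms plus one")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - k - 1)


def last_useful_k(cumulative_r2, n_obs):
    """Number of sources at which the corrected R^2 is largest."""
    corrected = [shrunken_r2(r2, n_obs, k)
                 for k, r2 in enumerate(cumulative_r2, start=1)]
    return corrected.index(max(corrected)) + 1, corrected


# Hypothetical cumulative R^2 after adding 1, 2, ..., 7 ordered sources
cumulative_r2 = [0.50, 0.70, 0.80, 0.84, 0.86, 0.87, 0.875]
best_k, corrected = last_useful_k(cumulative_r2, n_obs=16)
print(best_k)
print([round(value, 3) for value in corrected])
```

The sample R^2 keeps rising with every added source, while the corrected value rises and then falls; as the text cautions, the turning point is best read as an upper bound on the number of sources to retain.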

Because a successful experiment should account for most* of the
performance variance, there is often a tendency to want to include
more sources of variance than are probably necessary. Still, the
final decision of what to include or not will be made more on the
basis of practical considerations and the dangers of an erroneous
decision than on the results of a statistical test. The decision
is made more difficult, however, when we look at the cumulative
proportion than when we look at the proportion accounted for by an
individual term or factor. For in individual cases, we may see a
small value, e.g., a proportion of .01, and decide that even if it
were a real effect, it is marginal, and if we omit it erroneously
it is not going to be too critical. On the other hand, we might
hesitate to drop ten or fifteen effects that individually might
each account for a proportion of .01 or less, since cumulatively
they might, for example, account for .10 to .20 of the total
variance. Luckily, the problem is easier to resolve in the
screening phase, when we are only asking whether a particular
factor should be included in subsequent studies, than it would be
in the refinement phase near the end of the research problem, when
even small amounts (as long as they are real effects) should not
be ignored.

*Without more experience, what proportion should be accounted for
by a screening design cannot be stated with any degree of
confidence. Still, as a personal guess, if we started with a
30-factor study (and an astute experimenter), one ought not to be
happy unless one accounts for more than .80 of the variance in the
experiment with real effects.

But in the screening phase, even if a series of factors shows a
sizeable cumulative effect, if they have been preceded by a great
many interaction strings each with meager effects, and occur in
the second half of the ordered sources of variance, it is unlikely
that any effect will be critical.

Reverse Yates' algorithm. Daniel (1976, p 73) examines the
cumulative proportion of variance one step at a time using a
reverse Yates' algorithm as a computational aid. Beginning after a
reasonable number of terms has been included in the cumulative
proportion, he calculates the predicted value at each experimental
data-collection point in the design and compares it with the
empirically obtained value. Calculating the predicted values could
be done using the regression equation; Daniel, however, applies
the reverse Yates' algorithm, which is the same as the forward
Yates' with the following exceptions:

1.  Begin by writing the effects in the Standard Order, but
    inverted.

2.  Read off the estimated values at the end of the procedure
    with the conditions in an inverted Standard Order.

If one begins this reverse analysis with the effects, then the
values in what would ordinarily be the effect-totals column must
be divided by 2 (which is N/2 in the four-run example below) to
get the estimated performance values. However, if instead of
beginning with the effects one begins with the regression
coefficients, then no division is required. The values in the
effect-totals position of the reverse Yates' are the estimated
performance values. For example:

FORWARD YATES'

Exptl.
Cond.     Perf. (y)     1      2     (÷2) Effect     Source

(1)           3        11     18          9          2(M)*
a             8         7      2          1          A
b             5         5     -4         -2          B
ab            2        -3     -8         -4          AB


REVERSE YATES'

                                     (÷2) Est.        Exptl.
Source    Effect        1      2     Perf. \hat{y}    Cond.

AB           -4        -6      4          2           ab
B            -2        10     10          5           b
A             1         2     16          8           a
2(M)*         9         8      6          3           (1)

The proportions of variance accounted for by A, B, and AB are
.048, .190, and .762, respectively. If these were ordered from
largest to smallest, the cumulative proportion would be:

     AB               .762
     AB + B           .952
     AB + B + A      1.000

In this simplified example, Daniel might propose to find out what
the estimated performance would be for each condition if we assume
that A is actually zero for all practical purposes. Using the
reverse Yates' he would get:

                                                                 Exptl.
Source    Effect     1      2     (÷2) \hat{y}     y     (y - \hat{y})     Cond.

AB          -4      -6      3        1.5           2          .5           ab
B           -2       9     11        5.5           5         -.5           b
A            0       2     15        7.5           8          .5           a
2(M)*        9       9      7        3.5           3         -.5           (1)

*Value in Effect column is twice the value of the mean.

where \hat{y} is the estimated value and y is the obtained one. He
would test to see if the residual, (y - \hat{y}), could be
tolerated or not, and thereby decide whether the dropped variable,
A, can be excluded or not. In this artificial example there was no
mean difference and no source of error variance, so no
significance test would be meaningful. In the case of larger
designs, however, this is yet another tool to help the
investigator judge whether to include or omit a source of
variance.
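
The reverse procedure and the residual check can be sketched in Python. This is an illustrative sketch, not Daniel's own code; the names `sum_diff_passes` and `reverse_yates` are hypothetical. It assumes the effects are given in Standard Order with twice the mean first, and it divides the final column by 2, which coincides with N/2 for the four-run example used here.

```python
def sum_diff_passes(values):
    """Apply the Yates sum/difference passes (k passes for 2^k values)."""
    column = list(values)
    n = len(column)
    for _ in range(n.bit_length() - 1):
        pairs = [(column[i], column[i + 1]) for i in range(0, n, 2)]
        column = [a + b for a, b in pairs] + [b - a for a, b in pairs]
    return column


def reverse_yates(effects):
    """Estimated performance, in Standard Order, from effects in Standard Order.

    `effects[0]` must be twice the mean, as in the forward analysis.
    """
    totals = sum_diff_passes(reversed(effects))   # effects written inverted
    estimates_inverted = [total / 2 for total in totals]
    return estimates_inverted[::-1]               # back to Standard Order


observed = [3, 8, 5, 2]                    # (1), a, b, ab
full_model = reverse_yates([9, 1, -2, -4])
without_a = reverse_yates([9, 0, -2, -4])  # Factor A set to zero
residuals = [y - est for y, est in zip(observed, without_a)]

print(full_model)  # [3.0, 8.0, 5.0, 2.0]
print(without_a)   # [3.5, 7.5, 5.5, 1.5]
print(residuals)   # [-0.5, 0.5, -0.5, 0.5]
```

With every effect retained the observed scores are recovered exactly; setting A to zero leaves residuals of plus or minus .5, matching the table above when read in inverted Standard Order.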

Daniel also uses this calculation to discover whether there are
distortions in the data and whether transformations could be used
to simplify the model. In particular, he plotted the residuals
(i.e., the y - \hat{y}) against their corresponding performance
(y) values as proposed by Anscombe and Tukey (1963), and also
their distribution on a normal cumulative distribution grid. He
next searched these for patterns that would be indicative of
distortions in the data. While the study of residual patterns is
an important part of the data analysis process, no further
discussion on this topic will be given in this report. It is
described in detail in Daniel's (1976, pp 71) book.

Interpreting the cumulative effects of non-critical
factors. There is something disconcerting when it is
discovered that the non-critical factors (i.e., the ones that
individually account for only a small proportion of the
variance), in combination, account for a large chunk, perhaps .30,
of the total variance. That is a great deal of unexplained
variance and it may cause an investigator to think that
possibly some of the non-critical factors may be marginal ones
of minor but practical interest. He may wish to examine
these non-critical factors in order to decide which he still
believes are trivial and which might be considered real but
"marginal." Some considerations in this regard are listed
below:


1.  The small effect may in fact be trivial, a chance
    perturbation. It is unlikely to be found on
    subsequent tests.

2.  A noticeable effect might be due to error, an
    infrequent and intermittent disturbance in a few cells,
    affecting by chance a particular effect or two.
    Examples are momentary losses of attention on the
    part of the subject, an irrelevant but intermittent
    occurrence in the environment, or erroneous settings
    of the simulation equipment. The momentary effects
    are large, but are averaged down in the analysis.
    An examination of the raw data or a half-normal
    plot may reveal this.

3.  The effect may reflect an unexpected confounding
    with some concomitant, systematic, but irrelevant
    source of variance. This might not occur if the
    study were repeated and can often be avoided with
    better planning during the problem definition phase.
    The size of the observed effect may be distorted due
    to the confounding effect a) inflating a factor's
    otherwise trivial effect, or b) deflating an
    important factor's effect.

4.  The effect may be reliable, but small. More measures
    will be required to see if the effect occurs
    consistently. It might have been larger had a
    different part of the operational space been included in
    the experiment.*


The investigator, faced with the decision to include or
exclude the marginal factors, realizes that:

a)  If he includes a marginal factor, he adds to
    the expense in subsequent efforts that must
    allot more observations to study that factor,
    more time to change the factor during the
    experiment, and more money and manpower to build
    and maintain the factor into the simulation.
    If there are no major expenses associated with
    the inclusion of a marginal factor, then it
    might as well be included. If it is a wrong
    decision to include it, i.e., if it is not a
    reliable effect, it can be deleted later.

b)  If he excludes a marginal factor, he will be
    able to reduce the size of subsequent studies
    and possibly their costs, but if it is a real
    effect, his ability to predict will be reduced.
    Since it is a marginal factor, the error (to
    exclude or include incorrectly) will be
    relatively small. The balance arises when the
    fewest factors account for most of the variance
    in the experiment. By building a framework
    (a response surface) involving these critical
    variables, the marginal factors can be
    introduced into it at a later phase of the research
    program, to refine the original equation,
    when they can be investigated more thoroughly
    and with more precision than if they had been
    entered early during the screening phase.

*This does not mean that one should artificially extend
the boundaries of a factor just to get a larger (or more
significant) effect. We wish to order effects by their size
within a particular operational space.


Costs, interest, probable impact, difficulty, realism,
reliability and so forth, are all weighed in the
inclusion/exclusion decision regarding marginal factors.


Half-normal  Plots  (Daniel) 


When a large number of effects are being investigated,
the largest effects can be several times larger than the
average even when no effect is real. In an experiment with 31
effects, the size of the largest effect could be 2.4 times
larger than the average size when in fact the difference was
due only to chance. Using the traditional .05 significance
level in such an experiment would cause unreal effects to be
judged real in over half of all experiments done (Daniel,
1959, p 312). While an examination of mean differences and
eta squared values can help the investigator avoid trivial
effects, these measurements do not provide sufficient data
to protect the investigator from including effects which may
appear to be non-trivial but which are, in fact, chance
deviations.
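
How large the biggest of 31 purely chance effects tends to be, relative to the average, can be explored with a small simulation. The sketch below is an illustration only, not a reproduction of Daniel's calculation: it assumes the 31 effects are independent draws from a normal distribution with zero mean, and the function name, seed and number of trials are arbitrary choices.

```python
import random


def largest_to_average_ratio(n_effects, rng):
    """Ratio of the largest absolute effect to the mean absolute effect
    when every effect is pure chance (standard normal draws)."""
    effects = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_effects)]
    return max(effects) / (sum(effects) / n_effects)


rng = random.Random(1959)            # fixed seed so the run is repeatable
ratios = [largest_to_average_ratio(31, rng) for _ in range(2000)]

# Share of simulated experiments in which the largest chance effect is
# at least 2.4 times the average absolute effect.
share_above = sum(r >= 2.4 for r in ratios) / len(ratios)
print(round(share_above, 2))
```

Under these assumptions, a largest effect several times the average is a common outcome of chance alone, which is the point made in the paragraph above.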


Conventionally, t- or F-tests are used to protect the
investigator from overenthusiasm regarding a large effect.
Since economy is of paramount importance and replication is
avoided in the screening design, there is no internal data
with which to estimate the error variance needed for the
significance test. In the physical sciences, error variance
can be estimated from the results of other experiments
studying the same problems; this would be foolhardy to try
in psychology. Psychologists who run unreplicated factorial
designs often use higher-order interactions (generally those
involving more than three factors) to estimate the error
variance. This is done on the assumption that the effects of
these interactions are negligible. However, in screening designs
this is not possible since all higher-order interactions are
confounded with main and two-factor interaction effects.

Of  course,  if  any  strings  of  two-  and  three-factor 
interactions  are  trivial,  they  can  be  used  to  estimate  error. 
But  here  we  are  faced  with  an  enigma  since  we  have  no  error 
term  to  test  whether  these  interaction  strings  are  trivial. 
Birnbaum  (1959)  suggests  that  instead  of  assuming  that  certain 
interactions  are  zero,  an  inference  procedure  be  used  which 
assumes  that  a specified  number  of  effects  out  of  a total 
number  are  non-zero.  However,  he  develops  the  mathematics 
only  for  the  case  where  it  is  necessary  to  discover  whether 
one  effect  out  of  many  is  real  or  not.  He  concluded  that  his 
statistic  in  that  situation  would  be  about  as  sensitive  in 
detecting  one  real  effect  among  thirty-one  effects  (if  one 
real  one  were  present)  as  traditional  multiple  t-tests  were 
capable  of  detecting  one  among  15  possible  effects  with  ten 
degrees  of  freedom  for  error,  or  one  from  31  possible  effects 
with  over  20  degrees  of  freedom  for  error.  We  are,  of  course, 
more  interested  in  those  situations  where  more  than  one 
source  of  variance  is  likely  to  be  critical. 

Daniel (1956; 1959) developed a graphic method
(corresponding in principle to Birnbaum's statistic) for examining
the results from an unreplicated design to help judge the
reality of the largest main effects and interactions, and to
indicate the presence of unruly data. His method is to
graphically compare the empirically derived cumulative
distribution of the effects with a cumulative distribution derived
from a normally distributed population. To do this, the
results from the experiment are plotted on "half-normal grid"
paper.

Preparing  half-normal  grids.  The  steps  to  produce  a 
half-normal  grid  are  as  follows: 

1.  Obtain a sheet of Probability Scale graph paper.
    This paper is produced commercially (e.g., Keuffel
    and Esser Co., #358-23). On this paper, a graph of
    the theoretical normal distribution would be a
    straight line through the origin.

2.  Use that portion of the grid that begins with the
    probability, P, of 50 and goes up to a value greater
    than 99. (Note: These "probability" values, of
    course, are multiplied by 100 to eliminate having to
    print the decimal.)

3.  Rescale the graph paper with new probability values,
    P', calculated from the old values, P, where

        P' = 2P - 100.

    For example, if P = 70, then P' = 2 x 70 - 100 = 40.

4.  Locate the P' along the ordinate of the grid where
    each ordered effect (i.e., ordered contrast) must lie.
    A different set of values is required for each
    analysis in which the total number of effects is
    different. The equation to find the P' value for
    each particular rank is:

        P' = [(R - 0.5)/(N-1)] X 100*

    where R is the rank of the ordered effect and (N-1)
    is the number of effects that will be plotted; it
    is also the total degrees of freedom with N
    observations.

    For example, in a 2^(16-11) (Resolution IV) screening
    design, there are 31 effects to be plotted. The largest
    effect, ranked 31, would be plotted at
    P' = [(31 - 0.5)/31] X 100 = 98.39. The effect tenth
    from the top, rank 22, would be plotted at
    P' = [(22 - 0.5)/31] X 100 = 69.35.

    For a 2^(32-26) (Resolution IV) screening design, with
    63 effects, the effect ranked 22 would be plotted at
    P' = [(22 - 0.5)/63] X 100 = 34.13. The P and P' values
    for all ranks of designs with 15, 31, and 63 degrees of
    freedom (and effects) are given in Table 9. P values
    are probabilities (X 100) for each rank plotted on
    normal probability grids. P' values are the
    corresponding probabilities (X 100) plotted on half-normal
    grids. An example of a 31-effect grid is shown in
    Figure 6.


*To determine the standard score, z, of each rank
position on a unit normal curve (where the N and standard
deviation are both assumed to be 1) we may refer to any
normal distribution table such as Beyer (1966, p 117) and
look up the P (not the P') value (÷ 100) associated with
that rank. For example, in the above illustration,

    if P' = 40, P = 70; then z = .52.

The z-values can be used to determine the height of each rank
position above zero on the ordinate of a half-normal grid,
which could then be drawn directly rather than by extracting
them from a plot on normal probability paper. The z-value will
also be useful later in this paper when Zahn's work is
discussed.
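
The plotting positions described in steps 3 and 4 and in the footnote can also be computed directly. The following is a minimal sketch, assuming Python's standard `statistics.NormalDist` in place of a printed normal table; the function name `plotting_positions` is hypothetical.

```python
from statistics import NormalDist


def plotting_positions(n_effects):
    """Return (rank, P', P, z) for every rank, largest rank first.

    P' = [(R - 0.5)/(N-1)] x 100 is the half-normal probability,
    P  = 0.5(P' + 100) is the matching full-normal probability,
    z  is the unit-normal standard score at P/100.
    """
    unit_normal = NormalDist()
    rows = []
    for rank in range(n_effects, 0, -1):
        p_half = (rank - 0.5) / n_effects * 100
        p_full = 0.5 * (p_half + 100)
        z = unit_normal.inv_cdf(p_full / 100)
        rows.append((rank, p_half, p_full, z))
    return rows


for rank, p_half, p_full, z in plotting_positions(31)[:3]:
    print(rank, round(p_half, 2), round(p_full, 2), round(z, 2))
```

For 31 effects the first row is rank 31 with P' = 98.39 and P = 99.19, matching the worked example in step 4.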
TABLE 9

PROBABILITY* VALUES FOR
CONSTRUCTING HALF-NORMAL GRIDS**

    d.f. = 15             d.f. = 31             d.f. = 63         d.f. = 63 (cont.)
Rank     P'      P    Rank     P'      P    Rank     P'      P    Rank     P'      P

  15  96.67  98.33      31  98.39  99.19      63  99.21  99.60      31  48.41  74.21
  14  90.00  95.00      30  95.16  97.58      62  97.62  98.81      30  46.83  73.41
  13  83.33  91.67      29  91.94  95.97      61  96.03  98.02      29  45.24  72.62
  12  76.67  88.33      28  88.71  94.35      60  94.44  97.22      28  43.65  71.83
  11  70.00  85.00      27  85.48  92.74      59  92.86  96.43      27  42.06  71.03
  10  63.33  81.67      26  82.26  91.13      58  91.27  95.63      26  40.48  70.24
   9  56.67  78.33      25  79.03  89.52      57  89.68  94.84      25  38.89  69.44
   8  50.00  75.00      24  75.81  87.90      56  88.10  94.05      24  37.30  68.65
   7  43.33  71.67      23  72.58  86.29      55  86.51  93.25      23  35.71  67.86
   6  36.67  68.33      22  69.35  84.68      54  84.92  92.46      22  34.13  67.06
   5  30.00  65.00      21  66.13  83.06      53  83.33  91.67      21  32.54  66.27
   4  23.33  61.67      20  62.90  81.45      52  81.75  90.87      20  30.95  65.48
   3  16.67  58.33      19  59.68  79.84      51  80.16  90.08      19  29.37  64.68
   2  10.00  55.00      18  56.45  78.23      50  78.57  89.29      18  27.78  63.89
   1   3.33  51.67      17  53.23  76.61      49  76.98  88.49      17  26.19  63.10
   0      0  50.00      16  50.00  75.00      48  75.40  87.70      16  24.60  62.30
                        15  46.77  73.39      47  73.81  86.90      15  23.02  61.51
                        14  43.55  71.77      46  72.22  86.11      14  21.43  60.71
                        13  40.32  70.16      45  70.63  85.32      13  19.84  59.92
                        12  37.10  68.55      44  69.05  84.52      12  18.25  59.13
                        11  33.87  66.94      43  67.46  83.73      11  16.67  58.33
                        10  30.65  65.32      42  65.87  82.94      10  15.08  57.54
                         9  27.42  63.71      41  64.29  82.14       9  13.49  56.75
                         8  24.19  62.10      40  62.70  81.35       8  11.90  55.95
                         7  20.97  60.48      39  61.11  80.56       7  10.32  55.16
                         6  17.74  58.87      38  59.52  79.76       6   8.73  54.37
                         5  14.52  57.26      37  57.94  78.97       5   7.14  53.57
                         4  11.29  55.65      36  56.35  78.17       4   5.56  52.78
                         3   8.06  54.03      35  54.76  77.38       3   3.97  51.98
                         2   4.84  52.42      34  53.17  76.59       2   2.38  51.19
                         1   1.61  50.81      33  51.59  75.79       1    .79  50.40
                         0      0  50.00      32  50.00  75.00       0      0  50.00

*P values are probabilities (X 100) to be used on normal
probability grids. Adjacent P' values are probabilities (X 100) at the
same rank when half-normal probability grid is used.

**If normal probability paper is not available, grids may be
constructed directly by finding the z-score equivalent to the P-value (÷ 100)
for each rank and using it to measure off the distance on the ordinate
scale. The z-scores can be found in most normal distribution tables, e.g.,
Beyer (1966, p 117).
5.  Write a scale along the abscissa of the grid that
    covers the range of absolute* values of the effects.

Plotting the data. The absolute effects obtained from
the analysis of the experimental data are ordered from largest
to smallest and given the ranks from (N-1) to 1, respectively.
The coordinates of a point representing the largest effect
(ignoring signs) would be where the P' for the highest rank
(along the ordinate) and the proper absolute value (along the
abscissa) intersect. Each subsequent effect is plotted on the
line of its appropriate rank. Daniel (1959, p 314) suggests
that it is not necessary to plot every one of the smaller
effects at the lower ranks since they tend to be correlated.
The mean is not plotted, but block differences and
higher-order interaction strings, if they exist, are.

Interpreting half-normal plots. If none of the effects
in the experiment are real, that is, if the sizes of the
effects are no greater than might be expected to occur by
chance, the standard deviation of the values would be
approximated by the value of the effect at the rank order nearest
to the P' = 68.3 quantile. In other words, the standard deviation
would equal the value of the effect, X_R, when
R = .683(N-1) + 0.5. For 15, 31, and 63 degrees of freedom
(or N-1) this would be the value at rank positions 11, 22,
and 44 respectively. Under the null hypothesis, therefore, the
plotted points would theoretically approximate a straight line
through the origin and the point made by the rank at the
68.3 quantile. This straight line, the "chance" line, is
the cumulative distribution of a normal curve (the classic
S-shaped curve) as it would appear when plotted on
probability paper. Daniel (1959, p 316) plotted ten samples of
31 effects from purely random data. While the average of
the ten approximates a straight line very well, individually
they wander about the line in an irregular pattern, though
not enough to be misinterpreted as being real effects.

*The effects are ordered disregarding signs.

In practice, since some effects may be real, we do not
know exactly what the slope of the line should be. Instead,
we allow the data to determine where the straight line will
lie. It would be drawn by eye through points representing
the smaller half of the ranked data; ordinarily this line
should go through the origin, but occasionally it may not. The
farther the larger effects deviate to the right of this line,
the more probable it is that they did not occur by chance
and are in fact reliable effects.
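
The rank rule for the null case, R = .683(N-1) + 0.5, is easy to apply by machine. The sketch below uses hypothetical function names (`sd_rank`, `chance_sd`) and assumes the nearest whole rank is taken; it picks out the effect whose absolute value approximates the standard deviation when no effect is real.

```python
def sd_rank(n_effects):
    """Rank nearest the P' = 68.3 quantile: R = .683(N-1) + 0.5."""
    return round(0.683 * n_effects + 0.5)


def chance_sd(effects):
    """Absolute effect at the 68.3-quantile rank.

    Approximates the standard deviation only under the assumption
    that none of the effects is real.
    """
    ordered = sorted(abs(e) for e in effects)   # rank 1 = smallest
    return ordered[sd_rank(len(ordered)) - 1]


print([sd_rank(n) for n in (15, 31, 63)])
```

For 15, 31, and 63 effects the rule gives ranks 11, 22, and 44, as stated in the text. When some effects are real this single-rank estimate can be inflated, which is why the line is drawn through the smaller half of the ranked data.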

Interpretation tactics. Krane (1963), who adapted the
use of half-normal plots to multi-level factorial
experiments, suggests an iterative approach to the selection of
the real effects. He examines the largest point first to
see if it lies far enough to the right of the line to be
judged real, and if so, removes it and replots the remaining
effects and again decides if the largest of the remaining
effects deviates far enough from the straight line to be
judged real. This continues until he no longer believes
that an effect is real.
In practice, he may make a crude test of a number of the
largest effects by drawing a vertical line from the point
where the horizontal line on which the largest rank is located
intersects the empirically constructed cumulative distribution
line. He then considers only those effects lying to the right
of the vertical line. Next, he replots the effects after
having eliminated those largest effects already judged to be
real, draws a new line and again judges whether effects to
the right are real.

In the replotting, since there are fewer cases each time,
the position of the rank-order lines on the P' scale must
change. For example, the 31st line in Figure 6 is at P = 99.19
on the full-normal probability scale or P' = 98.39 on the
half-normal plot. These values can be found in Table 9. If the
effect on the 31st rank is considered real and is removed, and
the remaining 30 values are replotted, the probability position
of the 30th rank is now based on an (N-1) = 30 rather than 31.
Therefore, it cannot be plotted on the original half-normal
grid in Figure 6. The new P' for each rank must be recalculated
using the equation:

    P' = [(R - 0.5)/(N-1)] X 100

as was done before. Or, if it is apparent that the four
largest effects can be removed, then a new P' value for the
rank 27 effect would have to be calculated. P' would be 98.15.

Since a special grid has not been prepared for any size
other than 31, the reader can make his own by marking off the
correct grid on the upper half of the normal probability
paper. In this case, he would have to work backwards in his
calculations, first determining what the P' value would be
for a particular rank and a particular (N-1), and then
finding that position on the half of the normal probability
paper at P, where P = 0.5(P' + 100).


To facilitate this effort, however, calculations of P'
and P for the first four largest ranks for values of (N-1)
from 63 down to 4 are given in Appendix V. For example, if
(N-1) equals 27, then from Appendix V we would plot on one
half of a piece of full-normal probability paper the first
four largest ranks at the following positions:

    Rank       P

     27      99.07
     26      97.22
     25      95.37
     24      93.52

and assign the new P' = 2P - 100.
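
The recalculation needed at each replotting step can be sketched as follows. The function `replot_positions` is a hypothetical helper, not part of Krane's or Daniel's procedure as published; it simply drops the largest effects already judged real and recomputes P' and P for the remaining ranks from the equations given above.

```python
def replot_positions(abs_effects, n_removed):
    """Drop the n_removed largest absolute effects and return
    (rank, effect, P', P) for the rest, largest remaining rank first."""
    ordered = sorted(abs_effects)
    remaining = ordered[: len(ordered) - n_removed]
    n = len(remaining)                      # the new (N-1)
    rows = []
    for rank in range(n, 0, -1):
        p_half = (rank - 0.5) / n * 100     # P' = [(R - 0.5)/(N-1)] x 100
        p_full = 0.5 * (p_half + 100)       # P  = 0.5(P' + 100)
        rows.append((rank, remaining[rank - 1], p_half, p_full))
    return rows


# Illustrative data only: 31 made-up absolute effects, four judged real.
made_up_effects = [float(value) for value in range(1, 32)]
for rank, effect, p_half, p_full in replot_positions(made_up_effects, 4)[:4]:
    print(rank, round(p_half, 2), round(p_full, 2))
```

With 27 effects remaining, the four largest ranks fall at P = 99.07, 97.22, 95.37, and 93.52, in agreement with the values quoted from Appendix V.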

Detecting defective values. Krane (1963, p 284-285)
discusses Daniel's 1966 conclusions regarding the use of
half-normal plots to detect defective values. These are
cited here briefly to inform the reader who may be interested
in pursuing this form of analysis on his own but for whom
Krane's paper may not be readily available. Krane noted that
the half-normal plot of an experiment involving a number of
small but real interactions may appear very similar to the
results induced by plot-splitting, because "split plot error
contrasts invariably contain a relatively larger number of
the higher order contrasts." He added, "Our practice is
generally to employ a split plot analysis only when knowledge
of the experimental techniques indicate its propriety."
Krane also noted that because his analysis was usually based
on transformed data, he seldom experienced the downward
convexity of half-normal plots that Daniel, in his 1966 paper,
believed indicated the presence of an antilognormal
distribution of error. Krane pointed out, on the other hand, that
"the removal of a moderate number of points representing
apparently real effects often results in a downward convexity
of the upper portion of the plot. We generally attribute
this appearance to the inadvertent removal of one or more
points representing error contrasts, for the results look
very much like the plot of a normal distribution with
truncated upper tail."

Daniel, in 1959, felt that half-normal plots could be
used to detect defective values in the data. By the time he
had published his book in 1976, he no longer believed that
to be the case. In his book, Daniel (1976, p 149) felt that
"the signed contrasts in standard order have more information
in them than do the unsigned contrasts ordered by magnitude."
He spends a good part of his book showing how residual analysis
can be used to detect distorted experimental values. This
should be an important part of the analysis of any experimental
data and can be of particular value in studies employing
economical multifactor designs with minimum replication.
Anscombe and Tukey (1963) also treat the subject of residual
analysis. This topic will not be treated in this report.

Both of Daniel's books (1976; Daniel and Wood, 1971) are
recommended reading for anyone analyzing applied experimental
data. Unlike the authors of many textbooks on statistics,
Daniel discusses and deals with the interpretation problem
from a practical point of view based on years of experience.

Standardized  Half-Normal  Plot  (Zahn)* 

If Daniel had proposed no more than the foregoing
discussion of half-normal plots, he would have made a major
contribution to the analysis and interpretation of unreplicated
screening design data. At the least, this type of plot warns
the user that large effects might in fact be due to chance.
At the most, in this computer age, it encourages the
investigator to engage in that almost forgotten art of studying his
data directly. But Daniel did not stop there. Instead he
proposed the concept of a "standardized" half-normal plot
(Daniel, 1959, p 322).

Daniel proposed that a scale-free, standardized
half-normal grid be used on which fixed limits could be placed
to identify how far from the line deviant effects must be
to have a specified probability of being a real, rather
than a chance, effect. The advantage of this plan is that
it facilitates comparisons among sets of data using
different criteria. Furthermore, it serves as a graphic test of
statistical significance, alerting the investigator to the
possibility that he might be making Type I errors.

In the standardized version, Daniel's premise was that
with no real effects present in the data, the standardized
values of the absolute effects, when plotted on a
half-normal grid, would lie along a straight line through the
origin and the coordinate of the ordered effect at the rank
having the value approximating the standard deviation of
the data. The standardized values are obtained by dividing
the absolute effects by the estimated standard deviation.
Daniel estimated the standard deviation to be the value of
the effect at rank R = .683 Y + 0.5 (with Y = the largest
possible rank for the set of data). For data involving
15, 31, and 63 effects, the standard deviations would be
approximated by the values at ranks 11, 22, and 44,
respectively, when no real effects are present in the data.

*Just when this report was ready to go to press, the
papers by Zahn (1975a, 1975b) were discovered. Zahn's work
points out flaws in Daniel's method of producing
"standardized half-normal plots." Since it is believed that
half-normal plots are powerful tools for interpreting unreplicated
screening data, the original discussion regarding Daniel's
method was removed from this report and this brief notation
regarding Zahn's work was introduced in its place. The
reader is encouraged to read Zahn's original papers and to
use his version of the "standardized half-normal plots."

Based on the theoretical work by Birnbaum (1959),
Daniel (1959, p 322) provides the data for calculating
probability guidelines ("guardrails") which indicate
the limits above the "chance" line at which points may fall,
purely by chance, a specified proportion of the time. This
is a form of graphic significance test.

Zahn (1975a, 1975b) recently proposed modifications to
Daniel's version of the standardized half-normal plot. He
notes a minor flaw in the plotting positions and a major
flaw in the method of calculating the guardrails for the
standardized half-normal plots. Zahn describes two versions
of his own, X and S, but based on an empirical study, he
concludes that his version S is the superior one (Zahn,
1975b, p 210). The difference is primarily in the way the
standard deviation is calculated.

Zahn (1975a) proposes these changes in Daniel's
approach to standardized half-normal plots. Two minor
changes are:

1.  Reorient the position of the grid so that effects
    are on the ordinate axis and the rank orders
    are on the abscissa axis. This corresponds, he
    felt, "to the usual regression analysis graph
    on which the random variable is plotted as the
    ordinate" (p 191). He also suggests using the
    raw effect values rather than the standardized
    scores.
2.  Make minor changes in the plotting positions
    (i.e., the z-values of P and P') on Daniel's grid,
    since the standardized effects that Daniel uses
    are not actually half-normally distributed.
    Zahn (p 192) recommends minor changes when
    there are 15 effects and none when there are 20
    or more effects to be plotted.

For n = 15, the ranks and Daniel's z-values are shown
below along with Zahn's (1975a, p 192, Table 2) recommended
z-values for the new plotting positions:

    Rank    Daniel's z    Zahn's z

     15        2.12         2.050
     14        1.64         1.626
     13        1.39         1.376
     12        1.19         1.191
     11        1.04         1.040
     10         .90          .910
      9         .78          .794
      8         .67          .688
      7         .57          .589
      6         .48          .496
      5         .39          .408
      4         .30          .322
      3         .21          .239
      2         .13          .158
      1         .04          .079

P and P' values associated with Daniel's z's (d.f. = 15) can
be found in Table 9, this report. These values would shift
for Zahn's z. However, given the z-values, there is no
reason to obtain the probability values.
The two major changes are:

1.  Daniel makes his initial estimate of the standard
    deviation of the data on the basis of a single
    value, the effect at the rank position closest to
    P' = 68.3. For more stability, Zahn proposes, in
    his version S, to use a value based on the slope
    of the ordinary least squares regression line
    through the origin of the standardized half-normal
    grid, and fitted to the points of the smallest
    non-real ("error contrasts") effects, i.e., from the
    lowest rank, 1, up to rank a, where a equals
    [0.683(n + 1)].

    The estimated standard deviation so defined is:

        \hat{\sigma}_{(a,n)} = \sum_{i=1}^{a} X_i z_i \Big/ \sum_{i=1}^{a} z_i^2

    where

        a   = 0.683(n + 1) = number of effects to be
              fitted

        n   = largest rank

        X_i = absolute effect at rank i

        z_i = standard score of rank i on unit normal
              probability curve (see footnote, page
              86, this paper)

2.  Zahn (p 195) proposes different criteria for
    determining the guardrails and therefore computes new
    guardrails. The guardrails represent the distance
    above the "chance" line (i.e., the line through
    the smallest non-real effects) at which different
    probabilities of making a Type I error would occur
    if effects plotted above those guardrails were
    hypothesized as real. Specifically, Daniel's
    approach failed to take into account the fact that
    in the single experiment we are trying to estimate
    whether a family of effects is significant. The
    probability error rate (PER) is the probability that
    there is at least one false positive in the family
    of statements. Daniel's guardrails have a valid PER
    only if no real contrasts are present. They were
    appropriate for detecting one false positive. In
    screening designs, we expect more and thus we would
    want to employ a different PER. For example, if we
    wish to have the Type I error rate for k = 9 real
    effects to be α = .05, then the guardrail beyond
    which significant effects would be located on the
    grid would have to have a probability error rate of:

        PER = 1 - (1 - α)^k = .37
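
The arithmetic of the example can be checked with a few lines. This is only a worked illustration of the formula above; the function name `probability_error_rate` is hypothetical.

```python
def probability_error_rate(alpha, k):
    """PER = 1 - (1 - alpha)^k for a family of k statements."""
    return 1 - (1 - alpha) ** k


per = probability_error_rate(0.05, 9)
print(round(per, 2))   # .37, as in the example above
```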

Zahn (1975a) uses rather elaborate statistics to
calculate the guardrails for his version X (p 196) and an
empirical Monte Carlo sampling study to determine the
guardrails for his version S (p 197). He does provide the
critical values by which new guardrails (for PER = 0.05,
0.20, and 0.40) can be plotted for N = 15 for his version
S model and N = 15, 31, and 63 for his version X model.
These are provided in Appendix VIII.

Zahn (1977) stated that he had done little with this
work since the papers were published. As far as he knew,
no one had determined critical values for N = 31 or 63 for
his version S model. He suggested that the guardrails for
version X might be used instead, along with the more
reliable version S estimate of the standard deviation, as
long as the investigator realizes that version S requires
slightly larger effects than version X for the same
significance level. The differences for N = 15 can be
observed by comparing the values in his Table 5 and Table
7 (also reprinted in Appendix VIII of this paper) or by
studying the plots shown in his Figures 4a and 9.

In the behavioral sciences, one can use this approach,
but must beware of assuming that very precise judgments can
be made. For example, in applying half-normal plots to
screening designs, it is not certain that the distribution
of effects (representing values from aliased sources) is
necessarily the same as that of a full factorial with the
same number of effects. Also, the guardrails cited here
are calculated based on the assumption that a specific
number of effects might be real. Thus, the critical values
for plotting guardrails can vary considerably depending on
the assumptions of the investigator (or the model employed
in the calculation). If we do not take these mathematically
precise values too seriously, we can make effective use of
the half-normal plots.

These  plotted  values  are  only  one  of  a number  of 
criteria  to  be  used  for  screening  and  selecting  the  most 
important  variables  for  future  study.  The  half-normal  plots 
provide  a check  on  an  investigator  overenthusiastically 
declaring  effects  to  be  real  when  they  might  have  been  chance. 
Whether  the  probability  of  the  Type  I error  is  precisely  0.40 
or 0.30 is not critical in this case. Used judiciously
(and we do need more experience in using them in behavioral
research), these half-normal plots can be expected to be
extremely useful evaluative tools.

USING  ORDERED  DISTANCES  WITH  MULTIPLE  RESPONSE  DATA 

Wilk and Gnanadesikan (1961) propose a method of graphical
analysis using ordered distances which represent a
generalization and extension of half-normal plotting. This will
be discussed later in this report. Gnanadesikan (1963)
illustrates how these techniques might be used. His comments
regarding the use of these "internal comparison procedures" are
important from the point of view of research strategy and worth
noting here. He said (p 22-23):

    While formal procedures, with formal or informal
    interpretations, are useful in their own way, yet,
    as anyone who uses statistics learns rapidly,
    they do not satisfy all needs. It is neither usual
    nor productive to think that the real insights
    into data are gained by posing a few questions in
    terms of a few parameters and by seeking for their
    answers through the use of certain formal
    techniques. Statistical procedures, with or without a
    formal probabilistic framework, which are aids, in
    a sense, to "allowing the data to analyze
    themselves" are valuable tools in gaining insights
    into the structure of data ....

    Informal procedures, with their chief purpose of
    serving as aids to learning from data and, in a
    sense, unhampered by considerations of probability
    statements, should guide and stimulate the
    experimenter into partitioning the data, and studying the
    partitions separately, both with respect to the
    treatment structure and with respect to the response
    structure in the problem. Also, informal procedures
    should depend on prior as well as posterior (after
    seeing the data) considerations and judgment.


Perhaps  the  main  advantage  of  a tool  such  as  half-normal 
plotting  is  that  it  encourages  the  investigator  to  leave  his 
computer  outputs  and  immerse  himself  in  his  data. 


VALIDATION  TEST 


Wilburn  (1963,  p 23)  proposes  a validation  test  on  the 
final  selection  of  critical  factors  (and  noteworthy  interactions) 
to  ascertain  that  no  large  distortions  occurred  in  the  actual 
responses  that  could  have  seriously  altered  the  mean  effects. 

He writes: "The procedure used was to determine the standard
error of the individual observed responses by analyzing the
thirty-one mean effects. A second standard error, for the
difference between observed and predicted responses, was then
obtained with the predicted responses based on the assumptions
that all mean effects other than those for [the critical
factors] were indeed zero. If the two standard errors would
then be equivalent, both the total experiment and the
conclusions derived from it would be proved valid."


The first standard error is estimated by ordering all of
the effects of the sources of variance judged to be
non-critical and using the value at the rank position R for which
P' is most nearly 0.683, obtained from the equation:

R = 0.683 (N - 1) + 0.5

This would mean, for example, that the effect at rank 16
would serve as a rough estimate of the standard error of
23 sources, all considered non-critical.*

[0.683 (23 - 1) + 0.5 ≈ 16]
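
As a rough illustration of this rank rule, the sketch below computes R and reads the effect at that rank. The function names and the list of 23 effect values are hypothetical, and rounding R to the nearest whole rank is an assumption based on the phrase "most nearly 0.683."

```python
def rank_for_standard_error(n_sources: int, p: float = 0.683) -> int:
    """Rank R = p(N - 1) + 0.5, rounded to the nearest whole rank
    (the rounding rule is an assumption)."""
    return round(p * (n_sources - 1) + 0.5)


def standard_error_estimate(noncritical_effects):
    """Effect magnitude at rank R when effects are ordered smallest to largest."""
    ordered = sorted(abs(e) for e in noncritical_effects)
    r = rank_for_standard_error(len(ordered))
    return ordered[r - 1]  # ranks are 1-based


# The case worked in the text: 23 non-critical sources.
print(rank_for_standard_error(23))  # prints 16

# Hypothetical effects .01, .02, ..., .23, purely for illustration.
effects = [0.01 * i for i in range(1, 24)]
estimate = standard_error_estimate(effects)
```

With these made-up values the estimate is the sixteenth smallest magnitude, 0.16.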


The second standard error is calculated as follows.
First, do a reverse Yates' algorithm computation on the
calculated effects after making the effects of all non-critical
sources equal to zero. The answers so obtained are the
"predicted" responses. Second, subtract the predicted
response from the actual, observed response for each
condition. Third, rank order these differences including signs.
Fourth, plot them on normal probability paper
[P = (R - .05)/(N-1)] including signs. The difference value
scale is along the ordinate; the probability (P) value scale
is along the abscissa. Fifth, draw a line through the plots
approximating the least squares fit. Sixth, determine the
vertical distance between the .50 and the .84 P values.
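
The first two steps (predicted responses and their differences from the observed responses) can be sketched as follows. This is not Wilburn's own procedure: instead of the tabular reverse Yates' computation it multiplies the retained effect totals by the design's sign matrix, which should give the same predictions for a two-level design in standard order. The function names are hypothetical, and the four scores are borrowed from the artificial two-factor example used later in this report.

```python
def sign(effect: int, run: int) -> int:
    """Contrast sign (+1 or -1) of `effect` at `run`.
    Both are bit masks in standard order: bit 0 = A, bit 1 = B, ..."""
    factors_at_low_level = bin(effect & ~run).count("1")
    return -1 if factors_at_low_level % 2 else 1


def effect_totals(responses):
    """Contrast total for every source; index 0 is the grand total."""
    n = len(responses)
    return [sum(sign(e, r) * y for r, y in enumerate(responses)) for e in range(n)]


def predicted_responses(totals, critical):
    """Rebuild responses after setting every non-critical total to zero."""
    n = len(totals)
    kept = [t if e in critical else 0.0 for e, t in enumerate(totals)]
    return [sum(sign(e, r) * kept[e] for e in range(n)) / n for r in range(n)]


observed = [-7, 1, -3, 9]                 # conditions (1), a, b, ab
totals = effect_totals(observed)          # sources: mean, A, B, AB
# Treat the mean, A, and B as critical; AB is set to zero.
predicted = predicted_responses(totals, critical={0, 1, 2})
differences = [y - p for y, p in zip(observed, predicted)]
print(predicted)    # [-8.0, 2.0, -2.0, 8.0]
print(differences)  # [1.0, -1.0, -1.0, 1.0]
```

The differences would then be ranked with their signs and plotted as described in the remaining steps.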


*The rank nearest to P = 0.683 for all cases of n from 63
down to four is given in Appendix V.

This distance read from the ordinate scale represents the
standard error of these differences between predicted and
observed responses. If the two standard errors are essentially
equivalent, this is sufficient, Wilburn claims, to accept the
experiment as being valid. (Note: Obviously, it is "valid"
only insofar as the mathematics is concerned. Validity of
simulation, representativeness of the subjects and task, and
other features determine ultimate validity.)

NUMERICAL  EXAMPLE  OF  A SCREENING  STUDY  ANALYSIS 

An experiment was performed at the U. S. Naval Weapons
Station, China Lake, California that may represent the first
attempt on the part of engineering psychologists to employ
a saturated fractional factorial and foldover design for
screening purposes (Grossman and Whitehurst, 1976). In
this study the effects of eleven factors on the location and
identification of targets in a simulated terrain model were
investigated to ascertain their relative importance in that
task and to generate curves to indicate how performance
varied as a function of the more important effects.

The eleven factors that were investigated are listed in
Table 10. These factors could be divided into three classes
depending on whether they were subject, time, or environment
related. How the investigators handled the subject-related
factors within the experimental design was discussed earlier
in the section on the design of screening experiments. While
the investigators were primarily interested in the effects of
the single factor, Visual Acuity, on target acquisition, the
use of this multifactor plan illustrates how a much more
generalizable data base can be achieved with this approach than
had acuity been studied alone.

TABLE 10

THE ELEVEN FACTORS AND THE TWO LEVELS ASSIGNED EACH FACTOR

                                     Levels
     Factors                   -              +

A.   Visual Acuity             20/40          20/20
B.   Depth Perception          Poor           Good
C.   Color Vision              Deficient      Good
D.   Experience                2 Trials       14 Trials
E.   Slant Range               1600 m         800 m
F.   Target Type               APC            Tank
G.   Masking                   50-75%         None
H.   T/B Contrast              1.15           2.40
I.   Pattern Painting          Pattern        Solid
J.   Target Orientation        45 deg         90 deg
K.   Target Density            1 Target       3 Targets


The 2^{11-6}_{IV} experimental design was constructed from two
2^{11-7}_{III} basic and foldover blocks (Simon, 1973). This design
was made up of 32 experimental conditions and was capable of
estimating eleven main effects, fifteen strings of two-factor
interactions, four strings of three-factor interactions
(other than those confounded with main effects), and a block
effect. Four measures taken on each experimental condition
were combined into a proportion-of-targets-found score. No
effort was made to minimize or control sequence effects. The
experimental conditions and the performance scores are shown
in Table 11.

Analyses of the 31 sources of variance are shown in
Table 12.* In it are given the Effects, the Variance, and
eta squared for each source along with the Cumulative
Proportion of Variance Accounted For.

*These are not the analyses found in the Navy report, which
left much to be desired in this regard. The analyses and
conclusions in this report are solely those of this author.

TABLE 11

EXPERIMENTAL CONDITIONS AND PERFORMANCE
SCORES FOR NAVAL WEAPONS CENTER STUDY

    Block I (Basic 2^{11-7})          Block II (Foldover 2^{11-7})

     1   ejk             .250          17   abcdfghi      1.000
     2   afhi            .625          18   bcdegjk        .750
     3   bfghk           .125          19   acdeij         .625
     4   abegij          .750          20   cdfhk          .125
     5   cfgij           .250          21   abdehk         .875
     6   aceghk          .750          22   bdfij         0
     7   bcehi           .250          23   adfgjk         .875
     8   abcfjk          .625          24   deghi          .625
     9   dghijk          .875          25   abcef          .750
    10   adefg          1.000          26   bchijk         .375
    11   bdefik          .875          27   acghj          .500
    12   abdhj          0              28   cefgik         .875
    13   cdefhj          .625          29   abgik          .375
    14   acdik           .250          30   befghj         .750
    15   bcdg           0              31   aefhijk        .625
    16   abcdefghijk    1.000          32   (1)           0

Order of Effects Across Design Matrix in Block I*

    New Screening Label      Original Factorial Label

    A                        A
    B                        B
    C                        C
    D                        D
    E                        ABCD
    F                        ABC
    G                        BCD
    H                        ABD
    I                        ACD
    J                        AB
    K                        AC
    Strings:
    (AD)                     AD
    (BC)                     BC
    (BD)                     BD
    (CD)                     CD

*Block II is fold-over form. (See Simon, 1973)


In the table, the sources have been ranked from the
largest to the smallest effects. From that data, half-normal
plots are supplied for this experiment (Figure 7-A) and for a
second experiment (Figure 7-B) that was a repetition of the
first but with different subjects. No other data is given
here for the second experiment.

No detailed discussion of these results will be given
here except to note that from an examination of all the data,
it appears that at least four or five factors (E, A, G, F, and
possibly K) out of the eleven are critical. One
three-factor interaction showed a large effect and a cursory
examination revealed that out of the triple interactions in
the string, one was Interaction ABF. Also the string of
two-factor interactions showing the largest effect included
Interaction AF. Since these both include the factors showing
large (even larger) main effects, it suggests that both might
be ordinal, and would not change the decision regarding the
criticalness of any factor. From the half-normal plots
(Figure 7), the only real difference between the results of
experiments A and B is the increased importance of Factor K
(target density).

Figure 7. Half-normal Plots on Results of Naval
Weapons Center Experiment.

No effort was made to discover why Factor K took on
importance (i.e., eta squared = .139) for the second group,
whether it was a subject-by-factor interaction effect or the
result of some unsystematic disturbance to the data. Factor
D, on the other hand, was not considered a critical factor
within the limits of the Experience (i.e., familiarity with
the terrain) levels in this study, for three reasons: 1) it
does not show up as a better-than-chance effect on either
half-normal plot; 2) its effect is trivial in the second
experiment (i.e., eta squared = .002); and 3) in the design
used for that experiment, the D effect could be severely
confounded with a quadratic trend effect (i.e., 71%) if one
exists. No center points were included in the original
experimental design which might have provided a measure of
trend through the data, as well as the basis for a test for
lack of fit of the linear model of the screening design.
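
For readers who want to retrace the first experiment, the sketch below rebuilds the 32 conditions and estimates the eleven main effects from the Table 11 scores. It rests on several assumptions: the generators are read from the "Original Factorial Label" row (E = ABCD, F = ABC, and so on), the sixteen basic conditions are in standard order, Block II is the exact sign reversal of Block I, and each main effect is taken as the mean score at the + level minus the mean at the - level. The function names are hypothetical, and the output is not a substitute for Table 12.

```python
BASE = "abcd"
# Assumed generators, taken from the "Original Factorial Label" row.
GENERATORS = {"e": "abcd", "f": "abc", "g": "bcd", "h": "abd",
              "i": "acd", "j": "ab", "k": "ac"}

# Scores for conditions 1-32 as listed in Table 11.
SCORES = [.250, .625, .125, .750, .250, .750, .250, .625,
          .875, 1.000, .875, 0, .625, .250, 0, 1.000,
          1.000, .750, .625, .125, .875, 0, .875, .625,
          .750, .375, .500, .875, .375, .750, .625, 0]


def basic_block():
    """Sixteen runs in standard order; each run maps factor -> +1 or -1."""
    runs = []
    for idx in range(16):
        level = {f: (1 if (idx >> n) & 1 else -1) for n, f in enumerate(BASE)}
        for new_factor, word in GENERATORS.items():
            product = 1
            for f in word:
                product *= level[f]
            level[new_factor] = product
        runs.append(level)
    return runs


def foldover(runs):
    """Reverse every sign of every run."""
    return [{f: -s for f, s in run.items()} for run in runs]


def label(run):
    """Treatment-combination label, e.g. 'ejk'; '(1)' when all are low."""
    return "".join(f for f in sorted(run) if run[f] == 1) or "(1)"


def main_effect(design, scores, factor):
    """Mean score at the + level minus mean score at the - level."""
    high = [y for run, y in zip(design, scores) if run[factor] == 1]
    low = [y for run, y in zip(design, scores) if run[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


block_1 = basic_block()
design = block_1 + foldover(block_1)
effects = {f.upper(): main_effect(design, SCORES, f) for f in "abcdefghijk"}
ranked = sorted(effects, key=lambda f: abs(effects[f]), reverse=True)
print(ranked[:5])  # ['E', 'A', 'G', 'F', 'K']
```

Under these assumptions the five largest main effects come out in the order E, A, G, F, K, with D next, which agrees with the factors singled out in the discussion above.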

The  investigators  at  the  Naval  Weapons  Center  ran 
another  study  using  factors  A,D,E,  G*  in  a 2*x4*  factorial 
design  and  did  an  analysis  of  variance  on  the  data.  All 
factors but D were statistically significant at better than
p = .001, while the F for Factor D was less than 1. The four
factors plus several of their interactions accounted for .86
of the variance in that experiment, suggesting that the
screening study was successfully picking important factors.
Two-hundred fifty-six observations were required for this
factorial study, and although functions were approximated
through the mean data points for several pairs of factors, no
overall function was calculated. Considerably more
information in more useful form might have been obtained more cheaply
had the original screening study been augmented with
additional data points to create a central-composite design to be
analyzed by a regression analysis.

*These letters refer to the factors as labeled in
Table 10.

the investigator will analyze the subject data by averaging
each effect across subjects. Even when subject variance is
isolated in these experiments, subject-by-factor interactions
are usually included in the estimate of the "error" variance.
This so-called error variance then is used to test the
statistical significance of the estimated experimental effects.
Of all the uses of subject replication, this most common use
is probably the least informative.


When subjects are used in an experiment for replication
purposes (which implies no interest in critical subject
characteristics insofar as the replication group is concerned;
the groups are presumed to be homogeneous), two kinds of
analysis can be performed that will be considerably more
informative than a test of statistical significance. In the
early stages of the research program, the screening stage,
where economy is being emphasized and little replication is
anticipated, each subject as a replication who is added
should represent a separate verification study. Each
individual's data should be independently analyzed and the results
compared among subjects. In this way differences due to a
bad measure or to important subject-by-factor interactions
can be detected rather than hidden among the averages. At
the end of an experimental program, the data from subjects
as replications would be used to establish confidence limits,
which from an operational point of view is far more useful
information than a test of statistical significance.

*This designation is used to distinguish this use of
subjects from the case in which subjects are introduced into
the experiment to represent specific combinations of subject
characteristics. We expect subjects as factors to show a
difference, or at least, would not be surprised if they do.
On the other hand, we "hope" that subjects as replications
will not differ in their performances, but would be
interested to know if they do.

**In a survey of 239 experiments published in the Human
Factors Journal, the median number of subjects as replications
was nine (Simon, 1976b, p 27).

ESTIMATING  CONFIDENCE  LIMITS 

Cochran and Cox (1957, p 5) have this to say about
significance tests and confidence limits: "... tests of
significance are less frequently useful in experimental work than
confidence limits. In many experiments it seems obvious
that the different treatments must have produced some
difference, however small, in effect. Thus the hypothesis that
there is no difference is unrealistic: the real problem is
to obtain estimates of the sizes of the differences. The
construction of confidence limits may add something to the
interpretation of a test of significance." They note that if
the difference between performances on two machines is not
found to be statistically significant, it does not prove
that the performances (and thus the machines) are identical.
They argue that if the 95% confidence limits for the
differences in performance were relatively small, then the true
differences would probably be of no practical significance,
and "... consequently, it could be said that for all
practical purposes the 2 machines are identical in speed.
This is much more positive and useful than the mere
statement that the difference in speeds was not
statistically significant." Conversely, they add, if the
confidence limits are large, then "... there is no justification
for the conclusion that the machines can be regarded as
equivalent. All that we have learned is that the data are
not sufficiently accurate to show whether there is a
difference in speed that is of practical importance."

In problems of equipment design, valid confidence limits
are of considerable operational importance. While mean
performance is useful to know, knowing the limits (i.e., the
estimated performance of the 95th or the 5th percentile man)
may be even more important from the standpoint of safety
and/or mission success.


Confidence limits can be estimated with the following
equation:

\[ 100(1 - \alpha)\% \ \text{Confidence Limits} = \text{Mean} \pm \frac{t \, s}{\sqrt{n}} \]

where

t is the Student t for n-1 degrees of freedom at the α
error level

α is the probability of Type I error the investigator
is willing to risk

n is the number of observations on the condition

s is the standard deviation of the replications
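
A minimal sketch of this calculation follows. The scores are invented for illustration, the function name is hypothetical, and the Student t value is entered from a printed table rather than computed, since the standard library has no t distribution. The standard deviation is taken as the sample standard deviation (n - 1 in the denominator), which is an assumption about how s is defined here.

```python
import math
import statistics


def confidence_limits(observations, t_value):
    """Mean plus or minus t * s / sqrt(n), with s the sample standard deviation."""
    n = len(observations)
    mean = statistics.mean(observations)
    s = statistics.stdev(observations)
    half_width = t_value * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical scores from eight replications of one condition.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
T_95_DF7 = 2.365  # tabled two-sided Student t for alpha = .05, 7 df
lower, upper = confidence_limits(scores, T_95_DF7)
print(f"{lower:.2f} to {upper:.2f}")  # prints "3.21 to 6.79"
```

With only eight replications the interval is wide relative to the mean of 5, which is the kind of information a bare significance test would not convey.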


INTERPRETING MULTI-SUBJECT DATA


Subjects as replications should not be averaged together
until it has been established that they are in fact
homogeneous, at which time averaging becomes a cleaner way of
handling the data although a less informative one.

When  subjects  are  used  as  replications,  a complete  analysis 
should  be  performed  on  the  data  from  each  one  separately  and 
the  results  compared.  A number  of  possible  outcomes  may  be 
anticipated,  each  with  its  own  particular  interpretation. 

For  example: 

1. The rank order of the different sources of
variances (based on the magnitude of their effects)
is essentially the same for all subjects.

2. A few sources are consistently ranked first for
all subjects but after that there is little
agreement.

3. The ranks agree among some subjects but not
among others.

4. There is essentially no agreement in the ranks of
the sources among subjects.
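
The report does not prescribe a statistic for judging how well the source ranks agree between subjects. As one possible aid, the sketch below ranks each subject's sources by absolute effect and compares two subjects with Spearman's rank correlation, which is a technique brought in here for illustration and not part of the original procedure. The effect values and function names are hypothetical, and the simple formula used assumes no tied ranks.

```python
def ranks_by_magnitude(effects):
    """Rank sources 1..n from largest to smallest absolute effect (no ties assumed)."""
    order = sorted(effects, key=lambda source: abs(effects[source]), reverse=True)
    return {source: rank for rank, source in enumerate(order, start=1)}


def spearman_rho(effects_x, effects_y):
    """Spearman rank correlation between two subjects' source rankings."""
    rank_x = ranks_by_magnitude(effects_x)
    rank_y = ranks_by_magnitude(effects_y)
    n = len(rank_x)
    d_squared = sum((rank_x[s] - rank_y[s]) ** 2 for s in rank_x)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical effects for two subjects on five sources.
subject_1 = {"A": 0.24, "E": 0.34, "G": 0.22, "F": 0.18, "K": 0.05}
subject_2 = {"A": 0.30, "E": 0.28, "G": 0.20, "F": -0.15, "K": 0.08}
rho = spearman_rho(subject_1, subject_2)
print(round(rho, 2))  # prints 0.9
```

A single coefficient cannot distinguish outcome 2 from outcome 3 above, so it would supplement, not replace, a side-by-side inspection of each subject's ranked effects.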


If  the  overall  ranking  of  a majority  of  factors  in  a 
screening  study  agrees  across  subjects,  there  is  reason  for 
confidence  that  the  results  are  probably  accurate.  It  can 
be  argued,  of  course,  that  just  because  two  or  three  subjects 
agree  that  is  no  reason  to  believe  that  the  results  from  15 
to  20  subjects  would  also  agree.  A sample  of  three,  the 
argument  goes,  is  just  too  small.  It  could,  of  course,  be 
argued  that  in  a population  of  thousands,  15  or  20  subjects 
are  also  a rather  small  number.  However,  it  should  not  be 
forgotten  that  the  purpose  of  this  strategy  is  to  check  for 
gross  errors  and  to  do  so  as  economically  as  possible.  If, 
in  fact,  neither  time  nor  economy  are  major  considerations, 
then  one  might  run  the  thousands  of  subjects.  This  still  would 
not  deny  the  importance  of  examining  the  results  from  each 
one  at  a time  to  find  discrepancies.  One  strategy  to  increase 
one's  confidence  in  the  data  from  a few  subjects  is  to 
select  the  few  subjects  at  opposite  extremes  of  skill  or 
experience, for example, to test the limits. But when the
agreement is good, for a screening study, only a few subjects
(and a competent investigator) will ordinarily suffice.


*As the differences in rank become more evident, more
subjects may be required to understand why this is so.

If there is essentially no agreement in the source-ranks
among subjects, it may be due to:

1. The collection or analysis of the data was sloppy
with either considerable measurement or
observation errors.


2.  The  performance  measure  may  not  be  relevant  to 
the  problem  or  the  task. 

3.  The  factors  actually  have  trivial  or  no  effects 
on  performance. 

4.  The  task  is  either  too  difficult  or  too  easy  and 
little  differentiation  in  performance  is  occurring. 

When a few factors consistently rank first among subjects,
but the remainder fail to agree, it is likely that those not
agreeing are non-critical sources of variance and therefore
show a variability both within and between subjects due to
chance. The magnitude, as well as the ranks, will help
determine if this interpretation is correct.

When the source-ranks among some subjects agree and
disagree among others, several explanations are possible.
For example:

1. If the results show several groups of subjects
consistent within but not between groups, then
it suggests that there may be unidentified subject
factors interacting with the other factors. This
is an important finding and should be investigated
further.

2. If there is some consistency in the source-ranks
among some subjects and no consistency observed
among some others, this may mean that:

a. The inconsistent subjects were doing so
poorly that nothing really mattered.

b. There were data-collection errors among
the inconsistent.

c. The inconsistent subjects had not stabilized
their procedures before beginning the
experiment and either changed their approach to the
task in mid-study or exhibited learning (or
fatigue) effects that distorted the
experimental effects.

d. The inconsistent subjects were tested across
conditions in a different order and unisolated
sequence effects might be distorting
experimental effects.

Inspection of the raw data will often help find the
explanation.


These  are  only  a few  possibilities.  Only  by  inspecting 
the  raw  data  before  it  is  aggregated  can  an  investigator 
begin  to  have  faith  in  his  results,  particularly  when  the 
amount  of  data  is  small.  Certainly  when  inconsistencies  are 
observed,  they  should  not  be  hidden  by  averaging  on  the 
assumption  that  this  is  a cleansing  process.  It  is  not. 
Averaging  at  the  screening  phase  may  hide  important  effects 
or the fact that the data is poor. Interpreting averaged
results may lead to a distortion of the truth.

VII. ADJUSTING EXPERIMENTAL EFFECTS FOR TRENDS


Although the screening designs proposed in this report
are robust to trends, when any overlap with a trend effect
might distort the data more than is deemed incidental, the
investigator may wish to adjust statistically the
experimental effects for trends. An examination of the Percent
Overlap data at the bottom of design matrices for the 16,
32, and 64 factors (Table 1, Appendices II and III, respectively)
shows which effects require adjustment. Even if the
investigator has used procedures that are likely to minimize any
trend effects, he may still wish to adjust as a precaution.
It is apparent from the tables that those effects which must
be adjusted for linear and cubic trends need not be adjusted
for quadratic, and vice versa.


The methods of adjustment described here were taken from
a paper by Daniel and Wilcoxon (1966)*. They applied the
technique only to linear and quadratic trends. Methods for
adjusting for cubic trends are also included in this report.
When linear and cubic trends are both confounded with an
effect, both must be adjusted simultaneously.


*Those who wish to refer to the original paper by Daniel
and Wilcoxon (1966) to learn how the equations for the
correction values are derived will find the following pages in
that paper the most informative. The general equation for
deriving the correction factor for linear or quadratic trends
is (4.10) on page 273; no equation was provided for calculating
the cubic trend. The (L) term [or (Q) term] in that equation
can be calculated from the sequence of identities (4.1) and
(4.2) shown on page 269. It may also be calculated as the
sum of the cross products between the particular integer
Tchebycheff orthogonal polynomial coefficients and
performance. Equations (4.7), (4.8) and so forth on page 272 are to
be used to correct the appropriate estimated effects for
trend. In Appendix VI of this report the derivations are
given for the equations needed to adjust for both linear and
a cubic trend together.

CONSTRUCTING AN ARTIFICIAL PROBLEM WITH TREND EFFECTS

To illustrate how corrections for trends are calculated
and used, artificial data generated for a two-factor
experiment, replicated twice, will be used. There are, therefore,
eight observations and three effects, A, B, and AB. When
the experimental conditions are arranged in the Standard
Order, i.e., (1), a, b, ab, (1), a, b, ab, the performance scores,
unbiased by trend effects, are:

-7, +1, -3, +9, -7, +1, -3, +9

respectively. These yield a mean performance of zero,
regression coefficients of 5, 3, and 1 for the effects A, B,
and AB, respectively, and no error. The equation formed from
that data is:

Y = 5A + 3B + 1AB

To introduce trend effects into the data, linear,
quadratic, and cubic coefficients of the integer Tchebycheff
orthogonal polynomial (Fisher and Yates, 1963; Beyer, 1966;
DeLury, 1950) were multiplied by a factor of -4, 2, and 1,
respectively, and added to the experimental performance data.
The total design, with supplemental data to illustrate how
the performance data was produced, along with other
calculations to be used later to adjust for trend effects, is
shown in Table 13. The differences between the trend-free
and trend-biased effects in this example are shown in Figure 8.
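
The construction just described can be reproduced in a few lines. The sketch assumes the eight observations were taken in the standard order listed and uses the usual tabled integer orthogonal polynomial coefficients for eight equally spaced points; the variable names are hypothetical.

```python
# Trend-free scores in standard order: (1), a, b, ab, (1), a, b, ab.
TREND_FREE = [-7, 1, -3, 9, -7, 1, -3, 9]

# Integer orthogonal polynomial coefficients for n = 8 (standard tabled values).
LINEAR = [-7, -5, -3, -1, 1, 3, 5, 7]
QUADRATIC = [7, 1, -3, -5, -5, -3, 1, 7]
CUBIC = [-7, 5, 7, 3, -3, -7, -5, 7]

# Multipliers given in the text for the linear, quadratic, and cubic trends.
W_LINEAR, W_QUADRATIC, W_CUBIC = -4, 2, 1

trend_biased = [
    y + W_LINEAR * l + W_QUADRATIC * q + W_CUBIC * c
    for y, l, q, c in zip(TREND_FREE, LINEAR, QUADRATIC, CUBIC)
]
print(trend_biased)  # [28, 28, 10, 6, -24, -24, -26, 2]
```

These eight values agree with the performance column of Table 14-B, which supports the assumption about run order.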

The trend-free and trend-biased performance values from
Table 13 can be analyzed using Yates' algorithm to
estimate the effects of A, B, and AB. These analyses are shown
in Table 14. A comparison of the coefficients of the three
effects before and after they have been biased by trends
reveals:

    Effect            A      B      AB

    Trend-free        5      3       1

    Trend-biased      3     -2       3


The  differences  are  striking.  In  addition,  no  replication 
or  replication  interaction  effects  are  indicated  with  the 
trend-free  data  but  (as  can  be  seen  in  Table  14-B)  both 
show  large  effects  in  the  trend-bias  data.  This  would 
ordinarily  be  delegated  to  an  error  variance. 
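
Yates' algorithm itself is short enough to sketch directly. The version below is a generic implementation, not taken from the report, and the function name is hypothetical. It is applied to the trend-free and trend-biased scores of the artificial example, with the replication treated as a third factor C, as in Table 14-B.

```python
def yates(responses):
    """Yates' algorithm for a two-level factorial in standard order.
    Returns the effect totals: grand total, A, B, AB, C, AC, BC, ABC, ..."""
    column = list(responses)
    n = len(column)
    k = n.bit_length() - 1  # number of factors; n must equal 2**k
    for _ in range(k):
        sums = [column[i] + column[i + 1] for i in range(0, n, 2)]
        differences = [column[i + 1] - column[i] for i in range(0, n, 2)]
        column = sums + differences
    return column


trend_free = [-7, 1, -3, 9, -7, 1, -3, 9]
trend_biased = [28, 28, 10, 6, -24, -24, -26, 2]

# Dividing each total by 8 gives the coefficients shown in Table 14.
free_coeffs = [t / 8 for t in yates(trend_free)]
biased_coeffs = [t / 8 for t in yates(trend_biased)]
print(free_coeffs)    # [0.0, 5.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(biased_coeffs)  # [0.0, 3.0, -2.0, 3.0, -18.0, 4.0, 8.0, 4.0]
```

The last four entries of the second list are the replication and replication-interaction coefficients that appear only in the trend-biased data.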


In practice, were this a real experiment, the
investigator would have no idea what the true trend-free results
should be. After all, the purpose of his experiment is to
discover that from the sample data. All he knows are the
performance scores and the results of their analysis. If
he has no way of measuring the trend effects, or for that
matter, even know for a fact that they exist, there is the
real and ever-present danger that the only information he
has will be distorted as in this example. Neither he nor
his public, if he publishes, can know for sure. Eventually
this  distorted  data  becomes  part  of  the  lore  naively  referred 
to  by  some  as  "scientific"  fact.  It  is  not  necessarily  true, 
as  some  defenders  of  poor  experimentation  like  to  claim, 
that  some  information  (however  poor)  is  better  than  no 
information.  When  poor  information  can  lead  to  erroneous 
decisions,  it  is  better  to  have  no  information. 


To offset these possibilities, the conscientious
experimenter should first use procedures that help reduce or
eliminate unwanted trend effects. Next, he should assign
his most important factors to trend-free columns. Then, as
a final precaution, he should adjust the estimated effects
for whatever trend remains. The method supplied by Daniel
and Wilcoxon (1966) and supplemented by Webb (1977) is
described next.


                              TABLE 14

               ESTIMATING EXPERIMENTAL EFFECTS IN THE
               ABSENCE AND PRESENCE OF TREND EFFECTS

Exptl.                             3         Unbiased
Cond.   Perform.     1      2   Effect-      Coeff.     Source
                                Totals        (÷8)

(1)        -7       -6      0      0            0
a          +1       +6      0     40            5        A
b          -3       -6     20     24            3        B
ab         +9       +6     20      8            1        AB
(1)        -7       +8     12      0            0        Error
a          +1      +12     12      0            0        Error
b          -3       +8      4      0            0        Error
ab         +9      +12      4      0            0        Error

(14-A)  EXAMPLE OF ANALYSIS OF DATA WITHOUT TREND BIAS


Exptl.                             3         Biased                               Unbiased
Cond.   Perform.     1      2   Effect-      Coeff.     Source      Correction    Coeff.
                                Totals        (÷8)

(1)        28       56     72      0            0                                    0
a          28       16    -72     24            3        A          -L - 2K  =       5
b          10      -48     -4    -16           -2        B          -2L - 3K =       3
ab          6      -24     28     24            3        AB         -Q       =       1
(1)       -24        0    -40   -144          -18        C (Repl.)                   0
a         -24       -4     24     32            4        AC                          0
b         -26        0     -4     64            8        BC                          0
ab          2       28     28     32            4        ABC                         0

Sources C (Repl.), AC, BC, and ABC are bracketed together as Error.

(14-B)  EXAMPLE OF ANALYSIS OF DATA WITH TREND BIAS,
        SHOWING HOW BIAS IS CORRECTED


DETERMINING THE VALUE OF TREND-ADJUSTMENT FACTORS


If the investigator wishes to correct for linear or
quadratic or cubic trend effects alone, each of these can
be calculated independently of one another using Equation
I or II or III, respectively, in Table 15. If he wishes
to adjust for cubic trends as well when they are correlated
with linear trends, the adjustment factors for the two
trends must be calculated simultaneously using the pair of
equations, IV-a and b, in Table 15. The information required
to solve these equations will be found in Appendices I, II,
and III, as well as in the equivalents of Table 14-B for new
problems. In the discussion that follows, the artificial
trend-biased data from the factorial design described in
the previous section will be adjusted for trends. How to
perform the analysis when a screening design rather than a
factorial design is involved will be discussed later.


Linear  Adjustment  Factor  L 


Equation I in Table 15 is needed to calculate the linear
adjustment factor. The numerical substitutions for symbolic
values in this example are shown below:

$$[8(168) - (8)^2 - (16)^2]\,L = 8(-584) - 8(24) - 16(-16)$$
$$(1024)\,L = -4672 - 192 + 256 = -4608$$
$$L = -4.5$$


TABLE 15

EQUATIONS NEEDED TO FIND THE LINEAR, QUADRATIC, AND LINEAR/CUBIC CORRECTION VALUES


I.  Equation to Determine the Linear Correction Value (L)

$$[N(LL) - (LX)^2 - (LY)^2 - (LZ)^2 \ldots]\,L = N(LP) - (LX)(X) - (LY)(Y) - (LZ)(Z) \ldots$$

II.  Equation to Determine the Quadratic Correction Value (Q)

$$[N(QQ) - (QX)^2 - (QY)^2 - (QZ)^2 \ldots]\,Q = N(QP) - (QX)(X) - (QY)(Y) - (QZ)(Z) \ldots$$

III.  Equation to Determine the Cubic Correction Value (K)

$$[N(KK) - (KX)^2 - (KY)^2 - (KZ)^2 \ldots]\,K = N(KP) - (KX)(X) - (KY)(Y) - (KZ)(Z) \ldots$$

IV.  Simultaneous Equations for Determining Linear (L) plus Cubic (K) Correction Values

a)  $$[N(LL) - (LX)^2 - (LY)^2 - \ldots]\,L - [(LX)(KX) + (LY)(KY) + \ldots]\,K = N(LP) - (LX)(X) - (LY)(Y) \ldots$$

b)  $$-[(LX)(KX) + (LY)(KY) + \ldots]\,L + [N(KK) - (KX)^2 - (KY)^2 - \ldots]\,K = N(KP) - (KX)(X) - (KY)(Y) \ldots$$


SYMBOLOGY FOR TABLE 15

N = Total number of observations, $r2^{k-p}$, where p may be any
    value from 0 up to (k-1), and r is the number of times the
    design is replicated.

L, Q, or K = Ordered integer Tchebycheff orthogonal polynomial
    coefficients for linear, quadratic, or cubic trends,
    respectively. (Found in Fisher and Yates, 1963; Beyer, 1966;
    DeLury, 1950.)

LL, QQ, or KK = Sum of squared L, Q, or K Tchebycheff coefficients,
    respectively.

P = Performance values (as found in Table 14, second column).

LP, QP, or KP = Sum of cross products between Tchebycheff
    coefficients for a specific trend (L, Q, or K, respectively)
    and the corresponding performance values for the ordered
    experimental conditions.

X, Y, Z, etc. = Ordered experimental conditions (±1) for Effects
    X, Y, Z, etc. (as found in experimental design). (Number of
    effects involved depends on how many are correlated with
    particular trends being corrected.)

LX, LY, LZ, etc. or
QX, QY, QZ, etc. or
KX, KY, KZ, etc. = Sum of cross products (called "inner products")
    between Tchebycheff coefficients for a specific trend (L, Q,
    or K, respectively) and the ordered experimental conditions
    (±1) for Effects X, Y, Z, etc. (depending on how many are
    correlated with the particular trend being corrected).
    (Inner products for the designs in this report can be found
    in Appendix I-C, II-D, and III-D.)

(X), (Y), (Z), etc. = Effect-totals for Effects X, Y, Z, etc.
    (depending on how many are correlated with particular trend
    being corrected). (Effect-totals are found in the last column
    of Yates' analysis, before dividing by N, e.g., as illustrated
    in Table 14.)

L, Q, or K = The unknown trend (L, Q, or K, respectively)
    correction value to be determined.


Note that while A, B, and AB are the only real experimental
effects in this example, with eight observations, in theory,
all effects of a 2³ factorial can be estimated. For example,
in Table 14-B we see that the effect-total of the imaginary
factor C is -144. In fact, this C represents a block
effect, the difference between the two halves of the
replicated experiment. At least one of the effects that is
correlated with a linear trend cannot be used as an
experimental factor in order to provide the necessary degree
of freedom for the trend estimate. Factor C, the block effect,
therefore would serve this purpose, it being the only
remaining source of variance confounded with a linear trend.

Instead of using Equation I, Table 15, to make the
calculation shown above, the adjustment for linear trend
could have been done this way:

$$(1024)\,L = LC\,(C)$$
$$(1024)\,L = 32(-144) = -4608$$
$$L = -4.5$$

In this calculation we used [LC(C)] instead of
[N(LP) - (LA)(A) - (LB)(B)] since they are equivalent. The
expression on the right removes from the total, N(LP), the
(LX)(X) terms of all sources of variance that were included
as experimental factors correlated with linear trend (i.e.,
A and B). That would leave as a remainder the value for all
sources of variance that were not included in the experiment
but were correlated with linear trend (i.e., C), which is
what LC(C) represents.

Quadratic Adjustment Factor Q

The calculations for isolating linear and quadratic
adjustment factors are the same except that Q-values are
substituted for L-values, as shown in Table 15, Equation
II. The substitutions of numerical for symbolic values in
this problem are shown below:

$$[8(168) - (8)^2]\,Q = 8(+344) - 8(+24)$$
$$(1280)\,Q = 2752 - 192 = 2560$$
$$Q = +2$$

As was done when estimating the linear trend adjustment
factor, all the non-experimental sources of variance
correlated with a quadratic trend could have been used to
arrive at the same answer. For example:

$$1280\,Q = 16(32) + 32(64) = 2560$$
$$Q = +2$$
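The single-trend calculations above can be checked with the following sketch, which evaluates Equations I and II of Table 15 and also the shortcut that uses only the sources of variance not assigned to experimental factors. The contrast columns assume the standard-order run sequence of Table 14-B; the helper names (`dot`, `bracket`, `factor_by_equation`, `factor_by_shortcut`) are hypothetical and not taken from the report.

```python
# Tchebycheff coefficients for N = 8 and the trend-biased performance values.
LINEAR = [-7, -5, -3, -1, 1, 3, 5, 7]
QUADRATIC = [7, 1, -3, -5, -5, -3, 1, 7]
P = [28, 28, 10, 6, -24, -24, -26, 2]

# Contrast columns (+1/-1) in standard order for the replicated 2^2 design.
A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
C = [-1, -1, -1, -1, 1, 1, 1, 1]  # block (replication) contrast


def times(u, v):
    """Element-wise product, used to build interaction columns."""
    return [x * y for x, y in zip(u, v)]


def dot(u, v):
    """Sum of cross products (an 'inner product' in the report's terms)."""
    return sum(x * y for x, y in zip(u, v))


EXPERIMENTAL = [A, B, times(A, B)]
UNASSIGNED = [C, times(A, C), times(B, C), times(times(A, B), C)]


def bracket(trend):
    """Left-hand bracket: N(TT) minus squared inner products."""
    return len(P) * dot(trend, trend) - sum(dot(trend, x) ** 2 for x in EXPERIMENTAL)


def factor_by_equation(trend):
    """Equation I or II: N(TP) minus (TX)(X) over the experimental effects."""
    rhs = len(P) * dot(trend, P) - sum(dot(trend, x) * dot(x, P) for x in EXPERIMENTAL)
    return rhs / bracket(trend)


def factor_by_shortcut(trend):
    """Same right-hand side built from the unassigned sources only."""
    rhs = sum(dot(trend, x) * dot(x, P) for x in UNASSIGNED)
    return rhs / bracket(trend)


L_alone = factor_by_equation(LINEAR)
Q_alone = factor_by_equation(QUADRATIC)
print("L:", L_alone, factor_by_shortcut(LINEAR))
print("Q:", Q_alone, factor_by_shortcut(QUADRATIC))
```

Effects whose inner product with a trend is zero drop out of both sides automatically, so all three experimental effects can be passed in without first checking which ones are correlated with the trend.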

Cubic  Adjustment  Factor  K 

This calculation would parallel the linear or the
quadratic examples, except, of course, only the terms that were
a source of variance in the experiment and were correlated
with a cubic trend would be involved. These are shown in
Table 15, Equation III. The calculation would be:

$$[8(264) - (16)^2 - (24)^2]\,K = 8(416) - 16(24) - 24(-16)$$
$$(1280)\,K = 3328$$
$$K = +2.6$$

Because the data was generated artificially, we know
that both the linear and cubic adjustment factors just
calculated are not correct. The linear one should not be
-4.5 but -4, and the cubic should not be 2.6 but 1. These
discrepancies occur because the linear and cubic trend
effects are correlated with one another, and if we intend to
adjust the effects for both, then the adjustment factors
must be determined for both simultaneously. This means
that one may correct for linear and/or quadratic trend
effects independently, but that if one were intending to
correct for cubic and linear, the set of simultaneous
equations, IV-a and b in Table 15, should be used to determine
the pair of adjustment factors.* The substitutions of
numerical for symbolic values in this problem are shown below:

a)  $$[8(168) - (8)^2 - (16)^2]\,L - [8(16) + 16(24)]\,K = 8(-584) - [8(24) + 16(-16)]$$

b)  $$-[8(16) + 16(24)]\,L + [8(264) - (16)^2 - (24)^2]\,K = 8(416) - [16(24) + 24(-16)]$$

which can be simplified to:

$$1024\,L - 512\,K = -4608$$
$$-512\,L + 1280\,K = 3328$$

If we multiply the second equation by two, we can eliminate
L and solve for K:

$$1024\,L - 512\,K = -4608$$
$$-1024\,L + 2560\,K = 6656$$
$$2048\,K = 2048$$
$$K = 1$$

Substituting this in Equation IV-a,

$$1024\,L - 512(1) = -4608$$

we simplify and get

$$1024\,L = -4096$$
$$L = -4$$

These values of L, -4, and K, 1, are the weighting factors
that we used to create the artificial data.
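The simultaneous solution can be reproduced with a short sketch that builds the coefficients of Equations IV-a and IV-b from the data and solves the pair by Cramer's rule. The run order and contrast columns are assumed to be those of Table 14-B, and the names `a11`, `a12`, `a22`, `b1`, `b2` are illustrative labels for the terms of the two equations.

```python
LINEAR = [-7, -5, -3, -1, 1, 3, 5, 7]
CUBIC = [-7, 5, 7, 3, -3, -7, -5, 7]
P = [28, 28, 10, 6, -24, -24, -26, 2]

# Experimental effects correlated with both the linear and cubic trends.
A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
EFFECTS = [A, B]


def dot(u, v):
    """Sum of cross products between two equal-length columns."""
    return sum(x * y for x, y in zip(u, v))


n = len(P)

# Coefficients of Equations IV-a and IV-b.
a11 = n * dot(LINEAR, LINEAR) - sum(dot(LINEAR, x) ** 2 for x in EFFECTS)
a12 = -sum(dot(LINEAR, x) * dot(CUBIC, x) for x in EFFECTS)
a22 = n * dot(CUBIC, CUBIC) - sum(dot(CUBIC, x) ** 2 for x in EFFECTS)

# Right-hand sides: N(TP) minus the (TX)(X) terms.
b1 = n * dot(LINEAR, P) - sum(dot(LINEAR, x) * dot(x, P) for x in EFFECTS)
b2 = n * dot(CUBIC, P) - sum(dot(CUBIC, x) * dot(x, P) for x in EFFECTS)

# Cramer's rule for the symmetric 2 x 2 system.
det = a11 * a22 - a12 * a12
L = (b1 * a22 - a12 * b2) / det
K = (a11 * b2 - a12 * b1) / det

print("L =", L, " K =", K)
```

Cramer's rule is used only because the system is two-by-two; with more correlated trends a general linear solver would be the natural choice.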

Making  the  Adjustment  for  Trend 

The equation for adjusting for any single trend effect is:

$$x = \frac{(X) - TX\,(T)}{N}$$

where:  x is the adjusted effect

        (X) is the effect-total of the source being adjusted
            (e.g., A, B, AB, etc.)

        T is the particular trend correction factor (e.g.,
            L or Q, etc.)

        TX is the sum of the cross products between the
            coefficients of the particular trend and the
            source (e.g., LA or QA or LB, etc.)

        N is the number of independent observations in the
            experiment

For example, in our fictitious data in Table 14-B, the
coefficient for the biased estimate of Interaction AB is 3.
To correct that value for the bias introduced by the
quadratic trend, we solve this equation:

$$AB = \frac{(AB) - QAB\,(Q)}{N}$$
$$AB = \frac{24 - 8(+2)}{8}$$
$$AB = +1$$

The trend-free estimate of the coefficient for Interaction
AB is 1.

To adjust an effect for both the linear and the cubic
trend, the general equation is:

$$x = \frac{(X) - LX\,(L) - KX\,(K)}{N}$$

Thus to correct Factor A for both linear and cubic trend
bias, we substitute:

$$\frac{24 - 8(-4) - 16(1)}{8} = 5$$

and for B:

$$\frac{-16 - 16(-4) - 24(1)}{8} = +3$$

both of which are the coefficients we had derived before
trends had been introduced to distort the data.

If  one  were  to  apply  these  same  adjustments  to  the 
trend-biased  effect-totals  for  the  sources  of  variance 
associated  with  the  block  (replication)  differences  and 
each  block-by-factor  interaction,  the  corrected  values 
would  all  be  zero  as  they  should  be  in  our  fictitious  data. 
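The adjustments described in this section can be applied to every source of variance at once, as in the sketch below. It is an illustration under stated assumptions rather than the report's own procedure: it subtracts the linear, quadratic, and cubic terms together in one formula, which is harmless here because each source is correlated either with the quadratic trend or with the linear and cubic trends, but not with both groups. The correction values L = -4, Q = +2, and K = +1 are those derived above.

```python
LINEAR = [-7, -5, -3, -1, 1, 3, 5, 7]
QUADRATIC = [7, 1, -3, -5, -5, -3, 1, 7]
CUBIC = [-7, 5, 7, 3, -3, -7, -5, 7]
P = [28, 28, 10, 6, -24, -24, -26, 2]

A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
C = [-1, -1, -1, -1, 1, 1, 1, 1]  # block (replication) contrast


def times(u, v):
    """Element-wise product, used to build interaction columns."""
    return [x * y for x, y in zip(u, v)]


def dot(u, v):
    """Sum of cross products between two equal-length columns."""
    return sum(x * y for x, y in zip(u, v))


AB = times(A, B)
SOURCES = {
    "A": A, "B": B, "AB": AB,
    "C": C, "AC": times(A, C), "BC": times(B, C), "ABC": times(AB, C),
}

# (trend coefficients, correction value) for each trend being removed.
TRENDS = [(LINEAR, -4), (QUADRATIC, 2), (CUBIC, 1)]

n = len(P)
adjusted = {}
for name, column in SOURCES.items():
    effect_total = dot(column, P)  # same value as Yates' last column
    correction = sum(dot(trend, column) * value for trend, value in TRENDS)
    adjusted[name] = (effect_total - correction) / n

for name, value in adjusted.items():
    print(f"{name:>3}: {value:+.1f}")
```

Under these assumptions the three experimental effects return to 5, 3, and 1, and the block and block-by-factor sources return to zero.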

Applying Trend-Adjustment Techniques to Screening Designs

Applying these techniques to screening designs involves
no unique problems as long as the analysis is done with the
original factorial labels in mind. The results from
Yates' algorithm will automatically rank the data in Standard
Order using the original labels. These original labels are to
be used as references to find in each screening design the
values needed to make the trend adjustments for the
particular effect. After the corrections have been made, the new
screening design labels would be substituted for the
original factorial labels.

Human performance is situation-specific and complex.
To understand and predict performance, therefore, it is
necessary to examine all of the critical factors operating
at the time performance is being measured (including those
associated with antecedent events that also can affect
performance). Equally important, but more frequently ignored,
is the need to provide measures that reflect the
complexity of performance in toto.

Although methods of handling multiple performance
criteria have been around for decades, experimental
psychologists in general, and engineering psychologists
specifically, have tended to examine the effects of
experimental factors on multiple performance measures one
criterion at a time. As performance under operational
conditions is generally complex, this one-at-a-time approach
regarding responses is no more acceptable than it is regarding
stimuli or the task situation. Informative results will be
obtained only when it becomes common practice to perform
bilateral multivariate experiments.

ADVANTAGES OF BILATERAL MULTIVARIATE EXPERIMENTS

The following are reasons why an investigator would want
to include multiple responses as an integral part of his
experimental plan and analysis:

1.  A single measure usually does not adequately
    represent the typical complex performance under
    investigation.

2.  A single measure may be an acceptable unitary
    concept but understanding would be improved if
    it were broken down into component measures,
    rather than tying it to any single one.

3.  Discovering evidence of interaction among response
    measures improves one's understanding of a
    phenomenon.

4.  An analysis of multiple effects jointly may lead
    to different conclusions than would the sum of
    responses analyzed individually.

5.  Understanding the joint contribution of several
    response variables can make it possible to select
    a smaller but most efficient combination of
    variables with which to measure performance.

6.  It is more economical to carry out a single test
    rather than a number of separate tests for each
    response before a significant effect is detected.

7.  Multiple responses increase the generality of the
    results.

8.  A multiple response measure of performance in many
    situations is the more natural condition, whereas
    if efforts were made to hold some measures
    constant, artificial restrictions are introduced into
    the data to distort the interpretation. However,
    comparisons and assessments of factors and
    interactions when there are multiple responses are
    complicated by the fact that there is no unique
    linear ordering for vectors. Different approaches
    have been devised to overcome this.
The independent variables in screening designs are
orthogonal (uncorrelated). However, it is highly likely
that the dependent variables (the responses, the criteria)
will be correlated to some degree.

Once an investigator has decided to make multiple
responses a critical part of his investigation, he must
then decide how he should analyze his data. It is not
always advantageous, and in fact it may be counterproductive,
to use the most sophisticated and formal methods of analysis
available.

A statement by Gnanadesikan (1963, p 23) is appropriate
here:

    While the majority of multiresponse techniques,
    especially those in the formal framework of
    hypothesis testing, have been thought of as
    analogues of certain uniresponse procedures,
    yet from the standpoint of useful interpretations
    quite often these procedures are not such
    analogues .... It should, therefore, be
    emphasized that a multiresponse analysis should be
    considered as supplement to and not replacement
    for parallel uniresponse analyses. Methods
    which stimulate the user to look at subsets of
    responses, including the study of several
    responses individually, are thus very useful.

In summary, the sophisticated investigator avoids a single
cookbook analysis but instead examines his data with any
technique that is likely to provide useful information.


Scope of This Section

A great many papers and books, dating back to the
mid-1930's, have been written about the methods for
analyzing experimental data involving multiple responses.
In this section, therefore, no attempt will be made to
explain the derivations of these methods in depth, nor to
provide the reader with more than a cursory, conceptual
description of how to use them. The purpose of this section
is to alert the experimental psychologist to the advantages
of techniques of multivariate analysis and to encourage him
to use them as a normal part of his experimental program.
To do this, some of the more popular as well as some less
familiar methods will be described. In some cases, enough
information will be provided, hopefully, to take some of
the mystery out of less familiar statistics, at least enough
to make them easier to understand when the user must go to
original papers to learn the mechanics of how to use them.

Some simple methods of analyzing multiple response
data are described because in many cases they will be
more responsive to an investigator's needs than one of the
more sophisticated analyses. For some of the more complex
analyses, recent innovations that facilitate the
interpretation of the data will be described. In some cases a
method may be selected to avoid a large or unusual computer
effort.

As Wilk and Gnanadesikan (1964, p 613) wrote:

    . . . there is a long existent need for procedures
    to handle data involving multivariate responses in
    such a way that the resulting statistical summary
    and analysis (i) takes some account of the
    multivariate structure, and (ii) encourages insight
    into the experimental situation (as distinct from
    carrying out artificial and often pointless tests
    of hypotheses). The indefiniteness and complexity
    of objectives of statistical analysis of
    multiresponse data emphasize the need for general
    informal procedures which help to convey to the
    data analyzer some of the information implicit
    in the data.

Hopefully, this section, while in many respects meager, will
at least show the reader that there are choices to be made
and provide enough detail to help him make the choice.

WEIGHTED  CRITERIA 

If  the  relative  importance  among  n different  sets  of 
responses  is  known  and  can  be  quantified,  the  investigator 
can  reduce  the  multiplicity  of  responses  to  a single  value 
and  treat  the  data  as  a unilateral  analysis.  For  example, 
if  all  of  the  responses  or  criteria  can  be  associated  with 
a dollar  value,  or  weighted  according  to  their  contribution 
to  some  other  single  concept,  then  they  could  be  combined 
into  a composite  variate,  W. 

Before the weights are assigned, however, each set of
performance scores must be transformed into standard scores.
The standard score for each set of responses would be:

$$z_i = \frac{y_i - \bar{y}_i}{s_i} \qquad (i = 1 \text{ to } n \text{ sets of responses})$$

where $\bar{y}_i$ is the mean and $s_i$ the standard deviation
of the particular set of performance values. Each set of
performance measures, $z_1$ through $z_n$, would be assigned
the weights $b_1$ through $b_n$, respectively, and a composite
score for each experimental condition would be calculated,
thus:

$$W = b_1 z_1 + b_2 z_2 + \ldots + b_n z_n$$

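The weighted-criteria calculation can be sketched as follows. The two response sets and the weights are hypothetical values invented for illustration, and the sketch assumes the sample standard deviation is the one intended for the standard score; the function names `standardize` and `composite` are likewise illustrative.

```python
from statistics import mean, stdev


def standardize(scores):
    """Convert one set of performance scores to standard (z) scores."""
    m = mean(scores)
    s = stdev(scores)  # sample standard deviation (an assumption)
    return [(y - m) / s for y in scores]


def composite(response_sets, weights):
    """Weighted sum of standard scores, one value per experimental condition."""
    z_sets = [standardize(scores) for scores in response_sets]
    n_conditions = len(response_sets[0])
    return [
        sum(b * z[c] for b, z in zip(weights, z_sets))
        for c in range(n_conditions)
    ]


# Hypothetical data: two response measures over four experimental conditions.
tracking_error = [12.0, 9.5, 14.2, 8.1]
reaction_time = [0.62, 0.55, 0.71, 0.58]
weights = [0.7, 0.3]  # hypothetical relative importance

W = composite([tracking_error, reaction_time], weights)
print([round(w, 3) for w in W])
```

Standardizing first keeps a measure with a large numerical range from dominating the composite simply because of its units.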
GRAPHIC  INSPECTION 

If the independent factors are quantitative and
continuous, each performance measure may be presented graphically
as a "response surface," which, with two predictor factors,
has the appearance of a contour map with equal performance
contours (e.g., Figure 9a). The hills and valleys of these
response surface contours indicate the maxima and minima
performance positions that can be associated with the
coordinates (or values) of the independent factors. When
optimum locations among multiple criteria do not coincide,
the investigator must find a way of studying the data in
order to make the best and most practical compromise.

If an investigator wished to find the optimum values
of two predictor factors for a combination of performance
measures, the contours for each measure could be drawn on
a common coordinate system (e.g., Figure 9b). However, when
there are more than two or three predictor factors, this
graphic method becomes awkward to use unless it is
meaningful to fix all but two of the predictor factors.

Given overlapping response surfaces, for example, one
showing performance and the other showing costs, an
investigator may visually search for the values of the equipment
parameters (the predictor variables) that lead to some
acceptable compromise between the two criteria.


[Figure 9. Artificial Data Illustrating Graphic Overlapping of
Two Response Surfaces. Axes: X1 (Resolution) and X2 (Contrast).
Solid lines represent performance data; dashed lines represent
cost in thousands of dollars. Arrows point to values of X1 and
X2 that optimize Y.]


USING LAGRANGE MULTIPLIERS


When there are too many independent variables to plot
on a two-dimensional piece of paper (or attempt to draw as
a three-dimensional surface), some technique other than
overlapping plots of the response surfaces must be used.
A procedure proposed by Umland and Smith (1959) may be
employed. While their description treats the topic when
only two criteria measures are being considered, it can be
extended to handle more criteria.

They propose to use Lagrange multipliers* to find the
optimum level of one fitted second (or first) order response
function, subject to the constraint provided by a second
fitted second (or first) order response function. For example,
assume we have two functions: one, the cost of building
each particular equipment configuration (as represented by
the experimental condition), and two, the level of operator
performance at each condition. It would be possible to
determine the combination of equipment parameters that
optimized performance at some specified level while keeping the
cost of the equipment within specified bounds. The converse
could also be determined, i.e., the lowest cost for some
fixed performance value. The procedure, a general outline of
which is illustrated in Umland and Smith's (1959, pp 290-291)
paper, is as follows:


*The general theory of Lagrange multipliers for solving
constrained optimization problems is clearly presented in
R. Courant, Differential and Integral Calculus, 1936, Vol. II,
pp 188-202.
1.  Two response surfaces are calculated using regression
    analysis to obtain the conventional least squares
    fit. Only first- or second-order surfaces can be
    handled, e.g.,
    $Y = \beta_0 X_0 + \sum_i \beta_i X_i + \sum_{i \le j} \beta_{ij} X_i X_j$

2.  Partial derivatives are taken with respect to each
    predictor factor in the two (or more) equations.

3.  A new set of non-linear equations, using Lagrange
    multipliers, is written.

4.  These non-linear equations must be solved with one
    of a number of available computer programs. Umland
    and Smith (1959, p 291) suggest a method of steepest
    ascent as given by Booth (1955) for an IBM 650
    Computer. However, a more recent program which
    Singer (1977) found useful was Subroutine ZXSSQ in
    the IMSL Library 1 (IBM 370 series computer).*
    Additional programming is required to fit the program
    to this particular application.

The results obtained would be the values of the two
predictor factors for the optimum level of one criterion
constrained by some value of the second.

Several precautions should be taken in using this
technique:

1.  An inspection of the surfaces individually will
    show whether they all have optima. Some surfaces
    appear as ridges rather than peaks, which could cause
    the computer to either supply numerous correct
    answers or, more likely in the search mode, be unable
    to arrive at a solution.

2.  Since optimum responses may not fall within the
    limits of the experimental space, limits must be
    written into the computer program to assure that
    the solutions obtained automatically by the
    computer will be useful.

3.  Coding the independent variables can simplify the
    magnitude of certain calculations which may
    overload the computer.


*Institute of Mathematics and Statistics Libraries, Inc.,
Sixth Floor, GNB Bldg., 7500 Bellaire, Houston, Texas 77036.
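The procedure outlined above can be illustrated with a deliberately simplified sketch. It assumes a second-order performance surface in two coded factors and a first-order cost surface, so that the Lagrange conditions reduce to a linear system; a second-order constraint would give the non-linear equations for which an iterative program is required. All coefficients, the budget value, and the function names are hypothetical, and the solver is plain Gaussian elimination, not the steepest-ascent or ZXSSQ routines cited in the text.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(aug[r][i]))
        if abs(aug[pivot][i]) < 1e-12:
            raise ValueError("singular system: no unique constrained optimum")
        aug[i], aug[pivot] = aug[pivot], aug[i]
        for r in range(i + 1, n):
            f = aug[r][i] / aug[i][i]
            for c in range(i, n + 1):
                aug[r][c] -= f * aug[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


# Hypothetical fitted coefficients, factors coded to the range -1 to +1.
b0, b1, b2, b11, b22, b12 = 50.0, 4.0, 3.0, -2.0, -1.0, 0.5  # performance
c0, c1, c2 = 10.0, 3.0, 2.0                                   # cost
budget = 12.0                                                 # cost constraint


def performance(x1, x2):
    return b0 + b1 * x1 + b2 * x2 + b11 * x1 ** 2 + b22 * x2 ** 2 + b12 * x1 * x2


def cost(x1, x2):
    return c0 + c1 * x1 + c2 * x2


# Lagrange conditions: dY/dx1 = lam * dC/dx1, dY/dx2 = lam * dC/dx2, C = budget.
matrix = [
    [2 * b11, b12, -c1],
    [b12, 2 * b22, -c2],
    [c1, c2, 0.0],
]
rhs = [-b1, -b2, budget - c0]
x1_opt, x2_opt, lam = solve_linear(matrix, rhs)

# Precaution 2: confirm the solution lies inside the experimental region.
inside_region = all(-1.0 <= x <= 1.0 for x in (x1_opt, x2_opt))
print(x1_opt, x2_opt, lam, inside_region)
```

The singular-system check corresponds loosely to the first precaution: a ridge-shaped surface may leave the equations without a unique solution.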

STEP-DOWN  PROCEDURE 

If the investigator cannot assign quantitative values to
his responses, but is able to rank them in order of importance,
he may assess the predictor factors in terms of the multiple
responses as a series of single-response assessments, using
a "step-down" procedure proposed by Roy (1958, p 1177), who
notes:

    The step-down procedure obviously is not invariant
    under a permutation of the variates and should be
    used only when the variates can be arranged on a
    priori grounds. Some advantages of the step-down
    procedure are (i) the procedure uses widely known
    statistics like the variance-ratio, (ii) the test
    is carried out in successive stages and if
    significance is established at a certain stage, one can
    stop at that stage and no further computations are
    needed, and (iii) it leads to simultaneous
    confidence-bounds on certain meaningful parametric
    functions.

The investigator would use an ordinary F-test at each
step of the analysis. He would begin by examining the most
important response alone, and perform the analysis of variance
and F-test on that. He would next use the second most
important response to assess the data as a uniresponse
analysis and F-test, but it would be conditional on the
first response used. That is, he would perform an analysis
of covariance, $y_{2 \cdot 1}$, i.e., response $y_2$ with the
effects of $y_1$ removed. Each succeeding response measure is
made conditional on all previous response measures in the
ordered sequence. This would continue until all p response
measures have been used and p independent uniresponse
assessments have been made.

Gnanadesikan (1963, p 23), in describing this technique,
writes the following in regard to setting the probability
value for rejecting the null hypothesis with this step-down
procedure:

    The hypothesis for the multiresponse
    situation is not rejected if and only if none of
    the sequence of uniresponse hypotheses is
    rejected. Under the overall (i.e.,
    complete multiresponse) hypothesis of no
    treatment effects, the separate F statistics
    are independently distributed. Hence, if
    $\alpha_1, \alpha_2, \ldots, \alpha_p$ are the
    $\alpha$-risks associated respectively with the
    p F-tests, then the overall $\alpha$-risk is
    given by $1 - \prod_{i=1}^{p} (1 - \alpha_i)$.

Roy (1958) describes how to choose the value of the α-risk
(probability of error) at each step, so as to ensure a
desired overall α-risk for the combined data.

Gnanadesikan (1963, p 25) provides an example of this
technique including a chi-squared-with-one-degree-of-freedom
probability plot of the squared estimates of the different
effects.
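The overall risk formula quoted above is easily evaluated. The sketch below also shows one simple way of choosing the per-step risks, namely setting them all equal; that equal allocation is an assumption made here for illustration and is not necessarily the allocation Roy (1958) recommends. Function names are hypothetical.

```python
def overall_alpha(step_alphas):
    """Overall alpha-risk for independent step-down F-tests: 1 - prod(1 - a_i)."""
    keep = 1.0
    for a in step_alphas:
        keep *= 1.0 - a
    return 1.0 - keep


def equal_step_alpha(desired_overall, p):
    """Per-step alpha giving the desired overall risk when all p steps share it."""
    return 1.0 - (1.0 - desired_overall) ** (1.0 / p)


# Three responses each tested at 0.05 inflate the overall risk.
print(round(overall_alpha([0.05, 0.05, 0.05]), 4))

# Per-step level needed to hold the overall risk at 0.05 over four responses.
step = equal_step_alpha(0.05, 4)
print(round(step, 4), round(overall_alpha([step] * 4), 4))
```

An investigator who regards the first-ranked response as most important might instead give it a larger share of the overall risk, which the first function accommodates directly.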
The use of MANOVA to analyze multiple response data is
analogous to the use of analysis of variance to analyze
single response data. The former takes into consideration
the fact that multiple criteria are seldom completely
independent and may depend upon one another or be hidden aliases
of a single more fundamental criterion. As with ANOVA, an
investigator may use MANOVA to:

1.  Estimate the probability that two or more groups
    are really different, i.e., that an observed
    effect is a reliable one.

2.  Determine the proportion of total variance
    accounted for by each factor, i.e., eta squared.

Instead of differences among means, we examine
differences among centroids. Instead of studying the variance,
we study the dispersion of the multiple responses in a
multivariate space. Detailed discussions on MANOVA can be found
in most references on multivariate analysis (e.g., Kerlinger
and Pedhazur, 1973; Cattell, 1966; Cooley and Lohnes, 1971).


Making separate analyses for each of a number of response
variables can lead to incorrect conclusions. Separate
responses are seldom completely independent and in fact may
be aliases of a single, more fundamental criterion. It is
possible that no univariate criterion alone would distinguish
among several groups, while a MANOVA would. This is
illustrated with some fictitious data (Figure 10) taken from
Kerlinger and Pedhazur (1973, p 359). It can be seen that
when the means of conditions A1, A2, and A3 are projected
Figure 10. Illustration Showing How Analyses of Single
Responses in a Multiple-response Experiment
Fail to Detect Real Differences
[From Kerlinger and Pedhazur, 1973, p 359]
Yet an inspection of the two-dimensional plot shows that the
three groups are clearly separated. This is what MANOVA
would detect.

We  will  begin  our  discussion  of  MANOVA  with  eta  squared, 
since  in  screening  designs,  this  information  would  ordinarily 
be  more  important  than  significance  tests. 

MANOVA  Eta  Squared 

Eta squared ($\eta^2$) from multiple response analysis of
variance problems will be calculated in one of two ways.

One-way designs. The first, which is not too important
for screening studies, is used in a one-way design with only
a single factor,

$$\eta^2 = 1 - \frac{|W|}{|T|}$$

where |W| and |T| are the determinants of a within-treatment
and a total-treatment matrix respectively.* This is
analogous to the eta squared for the single-response ANOVA. Eta
squared for ANOVA is equal to

$$\eta^2 = 1 - \frac{ss_w}{ss_t}$$

where $ss_w$ and $ss_t$ are within-group sum of squares and total
sum of squares respectively. By subtracting that proportion
of the total accounted for by the within group, i.e., |W|/|T|,
from one, we have the proportion accounted for by the groups
under consideration.

* In Appendix A of their book, Kerlinger and Pedhazur
(1973) provide an easily understood short course in matrix
algebra.


For a two-response study, the within-treatment matrix,
W, and the total treatment matrix, T, would consist of the
following elements:

$$W = \begin{pmatrix} ss_{w1} & sp_{w12} \\ sp_{w21} & ss_{w2} \end{pmatrix}$$

and the total treatment matrix, T,

$$T = \begin{pmatrix} ss_{t1} & sp_{t12} \\ sp_{t21} & ss_{t2} \end{pmatrix}$$

The elements of the matrices are calculated as follows.

For a total of N observations, with r groups and n
observations per group, the ss (sum of squares) and the sp (sum of
products) are calculated in the conventional way. The total
sum of squares would be:

$$ss_t = \sum^{N} X^2 - \frac{\left(\sum^{N} X\right)^2}{N}$$

Between-group sum of squares would be:

$$ss_b = \sum_{j=1}^{r} \frac{\left(\sum_{i=1}^{n} X_{ij}\right)^2}{n} - \frac{\left(\sum^{N} X\right)^2}{N}$$

Within-group sum of squares is obtained by subtraction:

$$ss_t - ss_b = ss_w$$

These must be calculated for each response measure (1 and 2,
which we call X and Y in our two-response example).
The sum of products (or sum of cross products between
X and Y, or product-sum as it has been called) is calculated
essentially the same as the sum of squares, except that
instead of multiplying X times X to get $X^2$, we now multiply
X times Y to get XY. Similarly, instead of multiplying
$\sum X$ times $\sum X$ to get $(\sum X)^2$, we multiply $\sum X$ times
$\sum Y$ to get $(\sum X)(\sum Y)$. Thus, for a total of N observations,
with r groups and n observations per group, the total sum of
products would be:

$$sp_t = \sum^{N} XY - \frac{\left(\sum^{N} X\right)\left(\sum^{N} Y\right)}{N}$$

the between-group sum of products would be:

$$sp_b = \sum_{j=1}^{r} \frac{\left(\sum_{i=1}^{n} X_{ij}\right)\left(\sum_{i=1}^{n} Y_{ij}\right)}{n} - \frac{\left(\sum^{N} X\right)\left(\sum^{N} Y\right)}{N}$$

and the within-group sum of products would be obtained by
subtraction, thus:

$$sp_t - sp_b = sp_w$$

Within each matrix, the sums of products in corresponding
positions on either side of the main diagonal are the same
(since $sp_{12}$ is the same as $sp_{21}$).
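The one-way calculation can be sketched as follows. The data are made up for illustration, the function names are hypothetical, and the sketch assumes two responses so that a 2x2 determinant suffices.

```python
def partition(groups_u, groups_v):
    """Return (total, between, within) sums of products of U and V.
    Passing the same response twice gives the sums of squares."""
    all_u = [v for g in groups_u for v in g]
    all_v = [v for g in groups_v for v in g]
    n_total = len(all_u)
    correction = sum(all_u) * sum(all_v) / n_total
    total = sum(u * v for u, v in zip(all_u, all_v)) - correction
    between = sum(sum(gu) * sum(gv) / len(gu)
                  for gu, gv in zip(groups_u, groups_v)) - correction
    return total, between, total - between


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def eta_squared_one_way(groups_x, groups_y):
    """Eta squared = 1 - |W| / |T| for a one-way, two-response design."""
    ss_x = partition(groups_x, groups_x)
    ss_y = partition(groups_y, groups_y)
    sp = partition(groups_x, groups_y)
    T = [[ss_x[0], sp[0]], [sp[0], ss_y[0]]]   # total matrix
    W = [[ss_x[2], sp[2]], [sp[2], ss_y[2]]]   # within-treatment matrix
    return 1.0 - det2(W) / det2(T), W, T


# Hypothetical scores: three groups, five observations each.
X = [[3, 4, 5, 5, 6], [4, 4, 5, 6, 6], [5, 6, 6, 7, 7]]
Y = [[7, 7, 8, 9, 10], [4, 5, 6, 7, 7], [5, 5, 6, 7, 8]]
eta2, W, T = eta_squared_one_way(X, Y)
```

Because the off-diagonal element is computed once and used in both positions, the symmetry of each matrix is built in.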

Multifactor  (and  screening)  designs.  When  the  study 
involves  more  than  one  factor,  as  in  the  screening  design, 
and  there  are  multiple  responses,  the  equation  for  eta 
squared  is 


n2  w JLULi 

it  i 
Note once again the analogy between this eta squared for
multiple-response data and eta squared calculated from
single-response data. For single response data, eta squared
would be the ratio of the sum of squares for the particular
factor or interaction over total sum of squares. In the
multiple-response case, it is the ratio of the determinants
of the factor matrix (F) plus error (E) matrix over the
total matrix.*

For screening designs, the F-matrix represents both main
and interaction effects. The E-matrix is equivalent to the
W-matrix in the previous equation for eta squared, both being
the residuals after all sources of variability between groups
have been removed from the total variance, or dispersion.

Thus, in MANOVA with multifactors, the between groups
dispersion can be partitioned into matrices for the individual
factors and the interactions, and eta squared values determined


* It should be noted that in the first equation it is
necessary to work from the within matrix rather than get eta
squared from a between matrix directly. In the second
equation, it is necessary to add the error matrix to the particular
factor matrix before finding the determinant. These are
necessary because all between-treatment matrices (which include
a factor or interaction matrix) are singular. That means that
at least two columns (or rows) of the matrix are proportional
to one another, e.g., 1 2 3 and 2 4 6; the determinant of
a singular matrix is always zero. This "no solution"
situation is avoided by working with the within-groups matrix and then
subtracting, or by adding the error matrix to the between matrix.
Because of this restriction, no eta squared can be calculated
for a screening design unless it is repeated at least twice
and an error term is obtained. At least, the author was unable
to find another solution by the time this report went to press.
for each of them as in the ANOVA case. Of course, with
Resolution III designs, interaction terms are not isolated
from main effects. With Resolution IV designs the
two-factor interactions are in fact strings. This does not
change the calculations. For MANOVA, the sources of variance
are partitioned in the same manner as in single response
ANOVAs. In a two-response study, for example, the matrix
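for Factor A would look like this:

$$F_A = \begin{pmatrix} ss_{A1} & sp_{A12} \\ sp_{A21} & ss_{A2} \end{pmatrix}$$

and for Interaction AB, for example, like this:

$$F_{AB} = \begin{pmatrix} ss_{AB1} & sp_{AB12} \\ sp_{AB21} & ss_{AB2} \end{pmatrix}$$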


Elements in the matrices for main effects are calculated
in the same manner they would be for the between-treatments
matrix. The only new elements are those for the
interactions, and these are not difficult to calculate with screening
designs in which all the interactions are linear products of
two two-level main effects. Thus the same equation is used
to calculate each element of the interaction matrices as the
main effects. The only difference in the calculation is that
with main effects, the $\sum X_i$ and $\sum Y_i$ represent the summing of
performance scores obtained under all high or all low
conditions, while with interaction effects one would sum
either all conditions in which both factors' levels were high
or both were low, or one would sum all conditions in which
the factors' levels were always mixed, one high and one low.
These two sums now represent the sums of two "groups" from
which sums of squares and sums of products are calculated.
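The grouping rule for main effects and interactions can be sketched as below. The eight-run design, the scores, and the function name are hypothetical. Coding each factor as +1/-1 lets the interaction "groups" be read from the product of the two factor columns: +1 where both levels are high or both low, -1 where they are mixed.

```python
def effect_matrix(column, xs, ys):
    """Sums of squares and products for one two-level effect.
    column holds +1/-1 per run: a main-effect column, or the product
    of two main-effect columns for a two-factor interaction."""
    n_total = len(column)

    def between(u, v):
        correction = sum(u) * sum(v) / n_total
        total = 0.0
        for level in (+1, -1):
            idx = [i for i, c in enumerate(column) if c == level]
            total += sum(u[i] for i in idx) * sum(v[i] for i in idx) / len(idx)
        return total - correction

    return [[between(xs, xs), between(xs, ys)],
            [between(ys, xs), between(ys, ys)]]


# Hypothetical runs with two factors shown and two responses.
A = [-1, +1, -1, +1, -1, +1, -1, +1]
B = [-1, -1, +1, +1, -1, -1, +1, +1]
AB = [a * b for a, b in zip(A, B)]      # +1 = both high or both low
X = [3, 7, 1, 5, 4, 8, 2, 6]
Y = [14, 21, 13, 11, 15, 20, 12, 13]

F_A = effect_matrix(A, X, Y)
F_AB = effect_matrix(AB, X, Y)
```

Each matrix produced this way comes from a single degree of freedom, so its determinant is zero until the error matrix is added to it.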

Once the appropriate sum of squares and sum of products
are obtained, the equation for eta squared requires that
matrices F and E be added. To add two matrices, in this case
F and E, it only is necessary to add the elements in
corresponding positions in each matrix to form the matrix sum.

For example:

C •') . C I,Ai 

v5  v7 

with 4 obtained by adding 3 plus 1, and 13 obtained by
adding 5 plus 8, and so forth. (You cannot obtain the
determinants for F and E and add them to get the
determinant for the sum. One must sum first and then get the
determinant.)
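A small numerical check of the parenthetical point, using made-up 2x2 matrices (the values are illustrative only):

```python
def add_matrices(m1, m2):
    """Element-by-element sum of two matrices of the same size."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


F = [[32.0, 16.0], [16.0, 8.0]]   # hypothetical factor matrix (rows proportional)
E = [[10.0, 4.0], [4.0, 6.0]]     # hypothetical error matrix

det_of_sum = det2(add_matrices(F, E))   # sum first, then determinant
sum_of_dets = det2(F) + det2(E)         # not the same quantity
```

Here the factor matrix alone has a zero determinant, yet the determinant of F + E differs from the determinant of E, so the two orders of operation do not agree.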

In  Appendix  VII,  algebraic  equations  are  given  to 
calculate  the  determinants  for  2x2  and  3x3  matrices, 
used  when  there  are  two  or  three  responses  in  the  MANOVA. 

When  there  are  more  responses,  the  analysis  is  sufficiently 
complex  to  require  a computer. 

Multivariate Test of Significance

In  multivariate  analyses,  much  attention  — possibly  too 
much  attention  — has  been  directed  at  tests  of  statistical 
significance.  Such  tests,  for  a null  hypothesis  of  "no 
effect"  against  the  completely  general  alternate  hypothesis, 
have  important  limitations.  While  a number  of  tests  have 
been  devised,  choice  among  them  is  based  largely  on  intuition. 


Wilks' lambda ($\Lambda$) (generalized mean) test is one of the
more popular tests of significant differences between groups
in multiple response studies and will be described here. It
determines a probability level for the null hypothesis of
equality of population centroids (mean vectors) on the
assumption of equality of dispersion (variance-covariance
matrices). The assumption is analogous to that of
homogeneity of variance in the univariate F-ratio test of equality
of means.

The equation for Wilks' lambda is:

$$\Lambda = \frac{|W|}{|T|} = \frac{|E|}{|B + E|} = \frac{|E|}{|E + F|}$$

Matrix T is equal to matrix (B + E), which is not surprising
since the total is equal to the between plus the within. We have
already indicated that both W and E are residual matrices that
are left after all known sources of variance have been removed
from the total.* In multifactor designs the B-matrix would
become a matrix (F) for each particular factor or interaction.
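As a sketch of the computation for any number of responses, lambda can be obtained from a general determinant routine. The matrices below are made up, and the elimination-based `determinant` helper is an illustrative stand-in for whatever library routine is available.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0                      # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def wilks_lambda(F, E):
    """Lambda = |E| / |F + E| for one factor or interaction."""
    total = [[f + e for f, e in zip(rf, re)] for rf, re in zip(F, E)]
    return determinant(E) / determinant(total)


F = [[32.0, 16.0], [16.0, 8.0]]   # hypothetical factor matrix
E = [[10.0, 4.0], [4.0, 6.0]]     # hypothetical error matrix
lam = wilks_lambda(F, E)
```

A lambda near zero indicates that the factor accounts for most of the dispersion; a lambda near one indicates little effect.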

Although the explicit distribution of Wilks' lambda is
not known except for a few special cases, there are a number
of transformations which enable lambda to approximate the
classical F-distribution. Most of them, as given, are
usually suitable only for the one-way MANOVA design. Tatsuoka
(1971, p 200) gives the formula for Rao's R-statistic having


* We shall assume that we are always dealing in screening
designs with a Model I (fixed effects) experiment.
an approximate F-distribution which is suitable for the
multiple independent variable (and screening design) case,
provided there is an estimate of error variance and
covariance possible. The equation he gives is:


with 


and 


with $p v_h$ and $ms - (p v_h / 2) + 1$ degrees of freedom. Also

$v_e$ = Number of observations in basic screening
design multiplied by number of repeats
beyond the original plan,

$v_h$ = Number of groups in factor being
investigated, minus one. In screening designs
this value will be 1 for main and
interaction strings.

p = Number of dependent variables.
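A minimal sketch of the conversion from lambda to an approximate F follows. It assumes the commonly published form of Rao's approximation, with m and s defined as in the code comments and with `v_e` treated as the error degrees of freedom; these definitions and the function name are assumptions and should be checked against Tatsuoka (1971, p 200) before use.

```python
import math


def rao_r(lam, p, v_h, v_e):
    """Approximate F for Wilks' lambda (assumed standard form of Rao's R).
    lam: Wilks' lambda; p: number of responses;
    v_h: hypothesis d.f.; v_e: error d.f. (an assumption, see lead-in)."""
    denom = p * p + v_h * v_h - 5
    # s is conventionally taken as 1 when the expression is undefined.
    s = math.sqrt((p * p * v_h * v_h - 4) / denom) if denom > 0 else 1.0
    m = v_e - (p - v_h + 1) / 2.0
    df1 = p * v_h
    df2 = m * s - p * v_h / 2.0 + 1
    root = lam ** (1.0 / s)
    r_stat = (1.0 - root) / root * df2 / df1
    return r_stat, df1, df2


# Screening-design case: one degree of freedom per effect (v_h = 1).
r_stat, df1, df2 = rao_r(lam=0.25, p=2, v_h=1, v_e=8)
```

With v_h = 1, as for every main effect or interaction string in a two-level screening design, s equals 1 and the statistic reduces to (1 - lambda) / lambda multiplied by (v_e - p + 1) / p.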


MANOVA  Versus  Multiple  Discriminant  Analysis 

Although it is not the intention in this report to
review every form of multivariate analysis available, some
comments regarding multiple discriminant analysis as it
relates to MANOVA may be helpful. Both techniques may be used
to examine one-way designs (single factor, multiple
conditions) with multiple response data. For a given set of data,
both techniques will produce identical overall tests of
statistical significance.
But MANOVA stops with this test of significance, while
multiple discriminant analysis provides the user with some
indication as to the nature of the difference. It does this
by providing a set of weights or coefficients for the several
dependent measures that will separate the mean values of the
conditions to the maximum extent. Essentially, it turns the
original dependent variables into new orthogonal dimensions
(i.e., canonical variables) which, like the factors of factor
analysis, may not be readily named. In certain human factors
for equipment design problems one may not find the orthogonal,
artificial variables as useful as the real world ones. The
canonical variables may provide clues for better
understanding, yet the original variables may still be of greater
practical value. Multiple discriminant analyses were
developed to handle one-way designs. In a multiple-response,
multifactor screening design, there are separate
discriminant analyses, one for each main and interaction effect.
Multiple discriminant analysis can be found in most books
on multivariate techniques (e.g., Cooley and Lohnes, 1971;
Kerlinger and Pedhazur, 1973).
GRAPHICAL ANALYSIS USING ORDERED DISTANCES

Wilk and Gnanadesikan (1961; 1964) describe a procedure
for graphical analysis of multiple response data by means
of probability plots. Their procedure represents a
generalization and an extension of the technique of half-normal
plotting proposed by Daniel (1959) for the graphical analysis
of single-response data. It was proposed specifically to
be used with two-level factorials where there is a meaningful
decomposition of the treatment structure into orthogonal
single degrees of freedom contrasts. It can also be
applied to results from the fractional factorial and
screening experiments. Where no independent estimate of
error is available, the use of this "internal comparison"*
method has several advantages:

1. It may reveal significant effects when
single-response analysis does not.

2. It may lead to smoother, more stable
statistical configurations than a
single-response analysis.

3. It provides an easily assimilable
summary of experimental results that
facilitates the investigator's personal
inspection of the data.

4. It helps clarify the interpretation
of data when different responses are
not orthogonal to one another.

* "Internal comparison" refers to comparisons based on
a statistical standard set by the data.

Throughout  the  many  references  to  this  technique,  the  point 
is  made  continually  that  the  intent  is  not  to  supplant  the 
marginal  analysis  of  individual  responses.  Instead,  both 
types  of  analysis  should  be  used  to  supplement  one  another. 
Roy, Gnanadesikan, and Srivastava (1971, pp 97-112) devote
an entire chapter to graphical methods and internal
comparison evaluation procedures for multiple response data,
including examples.

General  Description 

Analogous to the case of the half-normal plot, the
multiple response method of graphical analysis is based on
probability plots of ordered squared distances (defined as
"positive semi-definite quadratic forms"). The effects behind
the ordered distances are judged to be real when their points
deviate considerably from a straight line plotted on appropriately
scaled paper. Several problems arise, however, with
multiple response analysis that are not present in single
response analysis. One, in multiple response analysis, it
is necessary to approximate and estimate the distribution
which serves as the appropriate basis for the probability
plots. A procedure for doing this may be based on order
statistics from the gamma distribution and tables to
facilitate the required estimation. Two, while the
univariate analysis may be based on the half-normal
distribution (i.e., chi-square distribution with one degree of
freedom for the squared effects), the multivariate analysis uses
the standardized gamma distribution of a particular shape
determined by the data. Three, unlike the univariate case, the
problem of linearly ordering multivariate data is complicated by
the lack of a convenient measure of "size." Gnanadesikan and
his co-workers have developed techniques to help solve
these problems. Only a general description of these
techniques will be supplied here. The reader is referred to
the original papers and other references on the topic for
a working knowledge.


Gamma distribution paper. This technique requires that
the squared distances be ordered and plotted against the
corresponding quantiles of the gamma distribution.
Psychologists are familiar with special cases of the gamma
distribution, e.g., the chi-square and exponential
distributions. Unfortunately, unlike the uniresponse procedure
proposed by Daniel for which special "probability" paper can
be prepared, no single general probability paper can be
prepared for the gamma distribution. This is because the
distribution can be standardized through a linear
transformation for only two of the three parameters defining the
distribution, that is, for the origin and the scale, but
not for the shape. Special approximation tables or a
high-speed computer are required to calculate the actual
percentage points of ordered effects. Wilk, Gnanadesikan,
and Huyett (1962) and Roy, Gnanadesikan, and Srivastava
(1971) provide tables of percentage points for the reduced
gamma distribution, together with the numerical procedures
and approximations employed. Wilk, et al (1962, pp 102-103)
describe the procedure step by step and note that the
entire procedure is mechanized and in use at Bell Telephone
Laboratories for the IBM 7094 and GE 635 computers.

Computer programs for these calculations are also given in
Roy, et al (1971).
The effect of a factor in the univariate case is the
mean difference in performances between high and low levels
of the factor. With multiple responses, the measure of the
main effect of a factor would be the "distance" between the
high and low centroids in the multi-dimensional response
surface. For example, if there were three independent
factors with two levels in each and two responses, one
might graphically represent the data as shown in Figure 11.
The performances on conditions involving high and low levels
for Factor A are indicated by squares and circles,
respectively. The centroids are the darkened symbols.

Roy, et al (1971) describe the calculation this way:


Figure 11. Geometric Representation of One Main Effect, A,
in a $2^3$ Experiment with Two Responses. [From
Wilk and Gnanadesikan (1964, p 619, Fig. 1).]


For the bivariate response, therefore, a
natural measure of the main effect A would
be the "distance" between the centroids in
the two-dimensional response space. If $x_1$
is the contrast vector corresponding to the
main effect A, then the "distance" between
the two centroids is proportional to the
"length" of $x_1$. For instance, choosing the
compounding matrix $A$, in the defining
equation

$$d_i = x_i' A x_i, \quad i = 1, 2, \ldots, (n-1)$$

as the identity matrix of order 2 in this
case, so that $d_1 = x_1' x_1$, we get the squared
Euclidian distance between the two centroids
corresponding to the definition of the main
effect A. More generally, the (n-1) contrast
vectors $x_i$'s may be visualized as (n-1)
points in the p-dimensional space, as
squared lengths, or squared distances from
the origin, associated with the contrast
vectors.


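Selecting the compounding matrix. The defining equation,
written with matrix symbols, can be expanded to look like this:

$$\underbrace{d_i}_{\text{squared distance}} = \begin{pmatrix} x_{i1} & \cdots & x_{ip} \end{pmatrix} \underbrace{\begin{pmatrix} a_{11} & \cdots & a_{1p} \\ \vdots & & \vdots \\ a_{p1} & \cdots & a_{pp} \end{pmatrix}}_{\text{compounding matrix}} \begin{pmatrix} x_{i1} \\ \vdots \\ x_{ip} \end{pmatrix}$$

It is necessary for the investigator to arbitrarily specify
the values of the a weights of the compounding matrix with
the single restriction that the squared distances are
greater than or equal to zero. Symbolically:

$$x' A x \geq 0$$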
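The squared distance for each effect can be sketched as follows. The 2^3 design in standard order, the scores, and the function names are hypothetical; the identity matrix is used as the compounding matrix, which gives the squared Euclidean distance between the high and low centroids.

```python
from itertools import combinations


def effect_vector(column, responses):
    """Mean at the high level minus mean at the low level, per response."""
    vector = []
    for scores in responses:
        high = [v for c, v in zip(column, scores) if c > 0]
        low = [v for c, v in zip(column, scores) if c < 0]
        vector.append(sum(high) / len(high) - sum(low) / len(low))
    return vector


def squared_distance(x, A):
    """Quadratic form d = x' A x for compounding matrix A."""
    p = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(p) for j in range(p))


# Hypothetical 2^3 design (standard order) with two responses.
columns = {
    "A": [-1, +1, -1, +1, -1, +1, -1, +1],
    "B": [-1, -1, +1, +1, -1, -1, +1, +1],
    "C": [-1, -1, -1, -1, +1, +1, +1, +1],
}
for first, second in combinations("ABC", 2):
    columns[first + second] = [u * v for u, v in
                               zip(columns[first], columns[second])]
columns["ABC"] = [a * b * c for a, b, c in
                  zip(columns["A"], columns["B"], columns["C"])]

X = [3, 7, 1, 5, 4, 8, 2, 6]
Y = [14, 21, 13, 11, 15, 20, 12, 13]
identity = [[1.0, 0.0], [0.0, 1.0]]

distances = {name: squared_distance(effect_vector(col, [X, Y]), identity)
             for name, col in columns.items()}
```

Sorting the seven values in `distances` gives the ordered squared distances that are then plotted.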


Wilk and Gnanadesikan (1961, p 1210) state that the
elements in the A matrix are non-negative definite
quadratic forms. Some possible examples of the A matrix
might be a) the identity matrix, ($I$), b) a diagonal matrix
of reciprocals of estimates of the variances of the p
responses, ($D_{1/s_{ii}}$), or c) the inverse of the covariance matrix
of the original responses, ($S^{-1}$).

The inverse of the covariance matrix, $S^{-1}$, is a
particularly useful compounding matrix since it provides a
linear invariance and makes statistical allowance for
differing variances and correlations among the elements of
the effects vectors. However, it is recommended that an
$S^{-1}$ matrix be derived from the sum of squares and sum of
products of r effects (contrast) vectors, where r is a
subset of the total number of effect vectors. In the case
of ordered values, the subset of r vectors might include the
smaller half of the effects. This removes the larger effects
from the estimates, for if they are real, including them
would reduce the number of effects that would appear to
stand out from the rest. Excluding them gives the
smaller, but real effects a better chance of being detected.*


Two other useful compounding matrices, $I$ and $D_{1/s_{ii}}$, are
diagonal matrices. The diagonal matrix with weights
inversely proportional to estimated variances has been
found to yield a more sensitive analysis than equal weighting
as long as the estimated variances are based on the smaller
half of the ordered effects vectors (as proposed for $S^{-1}$).


Roy,  et  al,  recommend  that  several  different  compounding 
matrices  be  tried  in  estimating  the  squared  distance  and  the 
researcher  should  realize  that  whatever  compounding  matrix 
is  used,  subsequent  inferences  regarding  the  data  should  be 
"conditional"  on  this  choice. 
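Trying several compounding matrices on the same effect vectors might look like the sketch below. The effect vectors are made up, and two details are assumptions rather than the published procedure: the vectors are ordered by their squared Euclidean length to pick the smaller half, and the sums of squares and products are divided by r to form S.

```python
def quad_form(x, A):
    """Squared distance x' A x for a two-response effect vector."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))


def smaller_half_covariance(effects):
    """S from the sums of squares and products of the smaller half of
    the effect vectors (ordering rule and divisor r are assumptions)."""
    ordered = sorted(effects, key=lambda x: x[0] ** 2 + x[1] ** 2)
    subset = ordered[:(len(ordered) + 1) // 2]
    r = len(subset)
    s11 = sum(x[0] * x[0] for x in subset) / r
    s22 = sum(x[1] * x[1] for x in subset) / r
    s12 = sum(x[0] * x[1] for x in subset) / r
    return [[s11, s12], [s12, s22]]


def inverse_2x2(m):
    """Inverse of a nonsingular 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# Hypothetical effect vectors; the first is a large (real) effect.
effects = [(4.0, 2.75), (0.5, -0.3), (-0.4, 0.6), (0.2, 0.1),
           (-0.3, -0.6), (0.6, 0.2), (-0.1, 0.4)]

S = smaller_half_covariance(effects)
compounding = {
    "identity": [[1.0, 0.0], [0.0, 1.0]],
    "reciprocal variances": [[1.0 / S[0][0], 0.0], [0.0, 1.0 / S[1][1]]],
    "inverse covariance": inverse_2x2(S),
}
distances = {name: [quad_form(x, A) for x in effects]
             for name, A in compounding.items()}
```

Comparing the three sets of distances shows how the apparent separation of the large effect from the rest depends on the compounding matrix chosen.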


* Note the similarity between that tactic and that proposed
by Zahn when he calculates the standard deviation for the
half-normal plot (see p 97, this report).
Analyzing Subgroups

Graphical internal comparison procedures may also be
applied to subgroups of the effects vectors, selected
according to meaningful criteria which are independent of
the data. For example, one might look at different orders
of effects separately, e.g., main and two-factor
interactions, or isolate all higher-than-second-order
interactions and examine them.

Plotting  and  Evaluating  the  Ordered  Distances 

It  has  already  been  stated  that  under  the  null 
hypothesis,  i.e.,  no  systematic  effects,  the  ordered 
distances  would  behave  like  a random  sample  from  a gamma 
distribution  with  its  density  defined  by  origin,  scale, 
and  shape  parameters.  By  keeping  the  origin  at  0 and  the 
scale  at  1,  only  the  shape  parameter  is  unknown.  If  it 
were  known,  then  when  the  ordered  distances  were  plotted 
against  corresponding  quantiles  of  the  gamma  distribution, 
the  points  would  appear  in  a straight-line  configuration 
if  there  are  no  real  effects.  Major  departures  from  the 
straight  line  by  the  largest  effects  will  suggest  that 
those  effects  are  probably  real. 
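The plotting coordinates can be sketched without special tables, as below. The sketch assumes the shape parameter has already been estimated and is supplied by the user, uses the plotting positions (i - 0.5)/n as one common convention, and evaluates the gamma distribution by a plain series and bisection; the function names and the distances are hypothetical.

```python
import math


def gamma_cdf(x, shape, terms=400):
    """Cumulative gamma distribution (origin 0, scale 1) by series."""
    if x <= 0.0:
        return 0.0
    log_x = math.log(x)
    total = 0.0
    for n in range(terms):
        total += math.exp(-x + (shape + n) * log_x
                          - math.lgamma(shape + n + 1.0))
    return min(total, 1.0)


def gamma_quantile(prob, shape):
    """Quantile found by bisection on the cumulative distribution."""
    low, high = 0.0, 1.0
    while gamma_cdf(high, shape) < prob:
        high *= 2.0
    for _ in range(100):
        mid = 0.5 * (low + high)
        if gamma_cdf(mid, shape) < prob:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def plot_coordinates(distances, shape):
    """Pairs (gamma quantile, ordered squared distance) for plotting."""
    ordered = sorted(distances)
    n = len(ordered)
    return [(gamma_quantile((i + 0.5) / n, shape), d)
            for i, d in enumerate(ordered)]


# Hypothetical squared distances and an assumed shape parameter.
points = plot_coordinates([0.05, 0.17, 0.34, 0.40, 0.45, 0.52, 23.56],
                          shape=1.0)
```

Points that fall near a straight line through the origin are consistent with no real effect; a point lying far above that line, such as the largest distance here, marks a candidate real effect.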

Conclusion 


While  there  is  much  to  learn  before  one  can  comfortably 
use  this  graphical,  internal  comparison  method,  there  seems 
to  be  sufficient  justification  to  apply  it  to  screening 
problems.  Since  without  replication,  the  screening  plans 
have  no  independent  estimate  of  error  variance  to  test 
the  significance  (reliability)  of  an  effect,  this  internal 
comparison procedure serves as a useful alternative. Before
anyone can assess how valuable the technique is, more
experience is needed in using it and applying it to behavioral
data.



CANONICAL  CORRELATION  ANALYSIS 

Canonical correlation analysis is the generalization
of univariate multiple correlation analysis to two sets of
variables: usually, but not always, multiple independent
and multiple dependent variables. Canonical analysis provides
a measure of the degree of association between the two sets
of variables and may be useful for learning something about
the underlying relationships among the variables of the two
sets.


Applications 

Examples of two sets of multivariate data to which
canonical correlation analysis might be applied to
determine the degree of association and underlying relationships
are:


1.  Flight  performance  measures  at  the  beginning 
and  the  end  of  a training  program. 

2.  Instructors’  characteristics  versus  trainees' 
flight  performance  measures. 

3.  Instrument  design  factors  versus  multiple 
cost  criteria  (e.g.,  dollars,  performance). 

4.  Pilot  selection  test  scores  versus  flight 
performance  data. 

5.  Pilot  training-simulator  design  parameters 
versus  multiple  transfer-of-training 
criteria. 
Process

Many books have been written on canonical correlation
analysis, the theory and the mathematics (e.g., Cattell,
1966; Kerlinger and Pedhazur, 1973; Nie, Hull, et al, 1975;
Bock and Haggard, 1968; Tatsuoka, 1971). These will not
be discussed here. Although that background is important,
at the end of this section an improved canonical analysis
will be described. Therefore at this time, only the
fundamental process involved in the canonical analysis will
be discussed.


We begin with a table showing the coordinates of the
experimental space at which the data were collected and
the set of measures made on that set of conditions. In
screening designs, the coordinates are the conditions of
a fractional factorial and therefore orthogonal. The
response measures are almost always correlated. Thus the
raw data matrix for three independent and two dependent
variables would look like this:

                      Set 1             Set 2
 Observation       Independent        Dependent
   Number          A     B     C       X     Y

     1            -1    -1    -1      .3    14
     2            +1    -1    -1      .7    21
     3            -1    +1    -1      .1    13
     4            -1    +1    -1      .5    11
   ...etc         etc   etc   etc     etc   etc

From this data a table of intercorrelations is
constructed by finding the correlation between every pair of
columns, and locating them in the intercorrelation table
as follows:

                 Independent           Dependent
               A      B      C        X      Y

        A    r_aa   r_ab   r_ac     r_ax   r_ay
 (I)    B    r_ba   r_bb   r_bc     r_bx   r_by
        C    r_ca   r_cb   r_cc     r_cx   r_cy

 (D)    X    r_xa   r_xb   r_xc     r_xx   r_xy
        Y    r_ya   r_yb   r_yc     r_yx   r_yy


which can be simplified using matrix algebra and symbols as:

$$R = \begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix}$$

where $R$ is the entire correlation matrix, $R_{11}$ represents
the correlations among the independent variables, $R_{22}$
represents the correlations among the dependent variables,
$R_{12}$ represents the correlations between independent and
dependent variables, and $R_{21}$ represents the transpose of
$R_{12}$.

Computer programs exist that would work from the data
in the above matrix to find the solution to the canonical
correlation analysis. This in essence is what such a program
would do. It would search out a set of weights (i.e., Beta
coefficients) to assign to the independent variables and another
set of weights to assign to the dependent variables. With
these, two sets of canonical variates would be calculated.

A "variate" is a rotated dimension in the multivariate
space made up of composite scores derived from the weighted
values of the two sets of raw data.
The weights for the two sets are selected in a way that will
cause the correlation between the pair of variates to be a
maximum. The square of this correlation indicates the
proportion of the variance of the single criterion composite
accounted for by the predictor composite.* Next, a
second pair of variates could then be calculated that would
account for as much as possible of the variance between
the two sets that was left unaccounted for by the first
pair of variates. This procedure can continue, the maximum
number of iterations being equal to the number of variables
in the smaller of the two groups. Each new pair of variates
is completely orthogonal to all previous pairs of variates.
It may not be necessary to complete them all since most of
the variance may be accounted for by the first few pairs.
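The canonical correlations themselves can be sketched with NumPy from the partitioned correlation matrix. This is a minimal illustration, not any particular program's method: it takes the squared canonical correlations as the eigenvalues of the product R22^-1 R21 R11^-1 R12, and the data and function name are hypothetical.

```python
import numpy as np


def canonical_correlations(set1, set2):
    """Canonical correlations between two sets of columns."""
    set1 = np.asarray(set1, dtype=float)
    set2 = np.asarray(set2, dtype=float)
    p, q = set1.shape[1], set2.shape[1]
    # Full intercorrelation matrix, then its four partitions.
    R = np.corrcoef(np.hstack([set1, set2]), rowvar=False)
    R11, R12 = R[:p, :p], R[:p, p:]
    R21, R22 = R[p:, :p], R[p:, p:]
    # Eigenvalues of R22^-1 R21 R11^-1 R12 are the squared correlations.
    product = np.linalg.solve(R22, R21) @ np.linalg.solve(R11, R12)
    squared = np.sort(np.linalg.eigvals(product).real)[::-1]
    squared = np.clip(squared[:min(p, q)], 0.0, 1.0)
    return np.sqrt(squared)


# Hypothetical 2^3 design (Set 1) and two responses (Set 2).
design = [[-1, -1, -1], [+1, -1, -1], [-1, +1, -1], [+1, +1, -1],
          [-1, -1, +1], [+1, -1, +1], [-1, +1, +1], [+1, +1, +1]]
responses = [[3, 14], [7, 21], [1, 13], [5, 11],
             [4, 15], [8, 20], [2, 12], [6, 13]]

rho = canonical_correlations(design, responses)
```

The number of correlations returned equals the number of variables in the smaller set, and they come out in decreasing order. In this made-up example the first response is an exact linear function of the three factors, so the first canonical correlation is 1.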


Since the new variates are formed in pairs, the existence
of large weighting (coefficients) on the old variables in
the two groups would identify which ones were responsible
for the degree of correlation that was found. For example,
an idealized result might be:

                  Old            New Variates
               Variables          I        II

 Group I           1              H         L
                   2              H         L
                   3              L         H
                   4              L         H

 Group II          5              L         H
                   6              L         H
                   7              H         L
                   8              H         L

 Canonical Correlations         (.85)     (.75)

 Coefficient:  H = high weight
               L = low weight

* Thorndike (1975) discusses general considerations in interpreting
canonical correlations and specifically (pp 82-83) some problems in
interpreting the index of proportion of variance. A "redundancy index" is
proposed instead.

These results could be interpreted as follows. In the
first variate (I), approximately 72% of the variance due to
Variables 1 and 2 was accounted for by Variables 7 and 8.

In the second variate (II), approximately 56% of the
remaining variance (after Variate I was discounted) in
Variables 3 and 4 was accounted for by Variables 5 and 6.
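
As a quick arithmetic check, the two percentages are the squares of the canonical correlations given in the idealized example (.85 and .75), rounded. The symbol R_c for a canonical correlation is notation assumed here for illustration; it is not used in the report.

$$ R_{c,\mathrm{I}}^2 = (.85)^2 = .7225 \approx 72\%, \qquad R_{c,\mathrm{II}}^2 = (.75)^2 = .5625 \approx 56\% $$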

As  in  factor  analysis,  it  may  be  possible  to  find  the 
common  element  among  the  heavily  weighted  variables  to  be 
able  to  name  the  variates  in  the  two  groups  of  data. 

Limitations  of  Canonical  Correlation  Analysis 

With real data, the clear-cut divisions and associations
found in the above example seldom occur. The problem
of interpretation may be difficult. Trying to "name" the
new variates may also be difficult.

Perhaps the major limitation of a canonical correlation
analysis lies in the unreliability of the weights. The
problems that arise in trying to examine the coefficients of
individual terms in multiple regression problems when the
variables are correlated are only complicated further in
these bilateral regression analyses. Hoerl and Kennard
(1970a, b) cite the following characteristics of coefficients
estimated from ill-conditioned experimental designs:

1.  The  coefficients  become  too  large  in  absolute 
value. 

2.  Some  coefficients  have  the  wrong  sign. 

3.  Collectively  the  coefficients  are  unstable; 
another  set  of  performance  data  would  be 
unlikely  to  give  the  same  beta  values. 

4. Individual coefficients may be over or under
estimates of the strength of a particular factor.

To try to interpret the results from a canonical correlation
analysis by examining the individual weights, therefore,
seems to be overly optimistic. The more non-orthogonal the
original matrices, the less reliance can be placed on the
interpretation of individual coefficients. (See Simon,
1975 for more discussion of this problem.)

An  Improved  Method  of  Canonical  Correlation  Analysis 




Hoerl  and  Kennard  (1970a,  b)  proposed  to  use  "ridge 
regression"  to  improve  the  analysis  of  an  ill-conditioned 
multiple  regression  matrix.  This  analysis,  they  suggest, 
will  obtain  a better  prediction  equation  in  which: 

1.  The  estimated  coefficients  will  be  closer  to 
the  true  coefficients  on  the  average. 

2.  The  signs  attached  to  the  coefficients  will  be 
more  meaningful. 

3.  A point  estimate  of  a response  can  be  made 
with  a smaller  mean  square  error. 

4.  The  coefficients  will  be  more  stable  and  likely 
to  be  repeated  if  new  data  is  taken. 

Hoerl and Kennard's (1970a, b) original papers provide a
description of the philosophy and underlying mathematics
for ridge regression analysis. A simpler explanation has
been provided by Simon (1975) and will not be repeated here.
Mechanically what is done is to add a small constant to the
unit diagonal of the intercorrelation table, and then
analyze this modified data by a multiple regression analysis
as usual. Finding the proper constant (usually less than
.05) depends on a study of a plot of the coefficients
obtained with each constant after trying a range of values.
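
The mechanics just described can be sketched in a few lines. The following is a minimal illustration, not Hoerl and Kennard's or Simon's procedure as published: it assumes standardized variables, so that the standardized coefficients come from solving (R + kI)b = r, where R is the predictor intercorrelation matrix and r holds the predictor-criterion correlations. The function names and the two-predictor correlations are hypothetical, chosen only to show an ill-conditioned case.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_coefficients(r_xx, r_xy, k):
    """Standardized coefficients after adding k to the unit diagonal of r_xx."""
    p = len(r_xy)
    modified = [[r_xx[i][j] + (k if i == j else 0.0) for j in range(p)]
                for i in range(p)]
    return solve(modified, r_xy)


# Hypothetical ill-conditioned case: two predictors correlated .95.
R_XX = [[1.0, 0.95], [0.95, 1.0]]
R_XY = [0.60, 0.50]

# Ridge trace: one line of coefficients per trial constant k.
for k in [0.0, 0.01, 0.02, 0.05, 0.10]:
    coefs = ridge_coefficients(R_XX, R_XY, k)
    print(f"k = {k:.2f}: " + "  ".join(f"{b:+.3f}" for b in coefs))
```

With k = 0 the second coefficient comes out negative even though both predictors correlate positively with the criterion, which is the wrong-sign symptom listed earlier. Plotting the printed coefficients against k gives the plot from which a constant is chosen.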

A number  of  studies  have  found  that  for  highly  correlated 
matrices,  ridge  regression  analysis  provides  a more  stable 
set  of  coefficients  and  a smaller  prediction  error  than 
conventional  multiple  regression. 

Carney  (1975)  proposes  using  ridge  regression  analysis 
rather  than  multiple  regression  analysis  to  obtain  canonical 
correlations.  As  with  the  single  response  case,  this  would 
reduce  the  instability  and  the  errors  in  the  estimates  of 
the weights used to obtain the canonical variates. He
developed a computer program that would provide Monte Carlo
data to evaluate and solve the "canonical ridge
estimates" (Carney and Anderson, 1974).

The investigator must decide what constant, k, to
add to the unit diagonals of the two matrices, R11 and R22, for
the canonical ridge analysis. Carney (1975, p 9) says:

"There  seems  to  be  no  theoretical  criterion  for  choosing 
k-values  for  canonical  ridge  estimates"  but  he  suggests 
several  possible  empirical  approaches: 

1.  Try  a series  of  k-values  and  select  the  solutions 
in  which  the  coefficients  appear  not  to  change 
much  over  a range  of  k's.  (This  is  feasible  for 
ridge  regression,  with  a single  set  of  coefficients, 
but  can  be  more  difficult  with  the  many  coefficients 
in  the  canonical  ridge  case) . 

2.  Limit  the  application  of  ridge  to  the  first 
canonical  correlates  only. 

3. Proceed as in the Monte Carlo experiments, treating
the sample covariance matrix as if it were a
population matrix, generating artificial samples,
and selecting k-values to minimise "mean square
error."

4. Perturb the data matrix and attempt to find k-values
for which the perturbations have little effect.

5. Subdivide the sample and select k-values for
which stability across subsamples occurs.
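
The first of these approaches (trying a series of k-values) can be sketched as follows. This is an illustration only and does not reproduce Carney's program. It assumes the ridge constants are added to the diagonals of the two within-set intercorrelation matrices, here called r11 and r22, and that the ridge-adjusted canonical correlations are the square roots of the eigenvalues of (R11 + k1 I)^-1 R12 (R22 + k2 I)^-1 R21. To stay short it is limited to two variables per set, where the eigenvalues have a closed form. All names and the correlated example values are hypothetical.

```python
import math


def inv2(a):
    """Inverse of a 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def mul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def add_ridge(r, k):
    """Add the constant k to the unit diagonal of a 2 x 2 correlation matrix."""
    return [[r[i][j] + (k if i == j else 0.0) for j in range(2)]
            for i in range(2)]


def canonical_ridge(r11, r22, r12, k1=0.0, k2=0.0):
    """Return the two ridge-adjusted canonical correlations, largest first."""
    r21 = [[r12[j][i] for j in range(2)] for i in range(2)]  # transpose of r12
    m = mul2(mul2(inv2(add_ridge(r11, k1)), r12),
             mul2(inv2(add_ridge(r22, k2)), r21))
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    eigenvalues = [(tr + disc) / 2.0, (tr - disc) / 2.0]
    return [math.sqrt(max(e, 0.0)) for e in eigenvalues]


# Hypothetical correlated sets, two variables in each group.
R11 = [[1.0, 0.9], [0.9, 1.0]]
R22 = [[1.0, 0.8], [0.8, 1.0]]
R12 = [[0.60, 0.50], [0.55, 0.50]]

for k in [0.0, 0.01, 0.02, 0.05, 0.10]:
    first, second = canonical_ridge(R11, R22, R12, k, k)
    print(f"k = {k:.2f}: {first:.3f}  {second:.3f}")
```

With k = 0 this reduces to the ordinary canonical correlations. For k greater than zero the values are shrunken quantities rather than exact correlations, so they are best read as a trace for judging stability across k.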

IX. EVALUATING THE ADEQUACY OF THE REGRESSION EQUATION*

One  of  the  better  features  of  central-composite  designs 
is  the  procedure  that  enables  the  investigator  to: 

• Collect data sequentially in blocks, beginning
with only enough for a first-order model when
no function is assumed

• Determine whether the model of that order adequately
fits the actual data

• Collect more data when lack of fit is significant
in order to fit the next higher-order model.

The analysis of variance of the classical central-
composite designs (Box and Wilson, 1951; Box and Hunter,
1956; Simon, 1970b, 1973), composed of 2^(k-p) fractional
factorials and center points in the first-order model plus
"star" points in the second-order model, would ordinarily
take the form of these examples:

First Order (3 factors, 4 center points, 12 observations)

    Source                  d.f.
    First order terms         3
        X1                    1
        X2                    1
        X3                    1
    Lack of fit               5
    Error                     3


*Most  of  the  material  for  this  section  was  taken  from  a 
paper  by  Draper  and  Herzberg  (1971).  Mr.  Edward  J.  Dragavon 
helped  interpret  the  paper  and  prepare  the  example. 

Second Order (3 factors, 6 center points, 20 observations)

    Source                  d.f.
    First order terms         3
    Second order terms        6
    Lack of Fit               5
    Error                     5
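
The degrees of freedom in the two examples follow from simple bookkeeping, sketched below. The sketch assumes an unreplicated cube portion of 2^(k-p) points, one pair of star points per factor in the second-order design, and error estimated only from the replicated center points. The function name is hypothetical.

```python
def ccd_degrees_of_freedom(k, n_center, order, p=0):
    """Return (observations, d.f. table) for a basic central-composite design."""
    cube = 2 ** (k - p)                      # fractional factorial portion
    if order == 1:
        n = cube + n_center
        table = {"First order terms": k}
    elif order == 2:
        n = cube + 2 * k + n_center          # cube + star + center points
        table = {"First order terms": k,
                 "Second order terms": k * (k + 1) // 2}
    else:
        raise ValueError("order must be 1 or 2")
    error = n_center - 1                     # from replicated center points
    lack_of_fit = (n - 1) - sum(table.values()) - error
    table["Lack of fit"] = lack_of_fit
    table["Error"] = error
    return n, table


for order, centers in [(1, 4), (2, 6)]:
    n, table = ccd_degrees_of_freedom(3, centers, order)
    print(f"order {order}: {n} observations, {table}")
```

For three factors this gives 12 observations with 3, 5, and 3 degrees of freedom for the first-order case, and 20 observations with 3, 6, 5, and 5 for the second-order case.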


Draper  and  Herzberg  (1971)  show  how  the  lack  of  fit  in  each 
of  these  two  types  of  designs  — first  or  second-order  — 
can  be  split  into  two  sources  that  can  help  the  investigator 
decide  where  the  lack  of  fit  (bias)  lies  and  what  his  next 
step  should  be. 

SPLITTING  THE  LACK  OF  FIT  OF  THE  FIRST-ORDER  DESIGNS 

The  sum  of  squares  for  the  first-order  lack  of  fit 
can  be  split  into: 

L1: Sum of squares due to lack of fit of
the interaction effects

L2: Sum of squares due to lack of fit of
curvature

The calculation of the L2 sum of squares, for estimating the lack
of fit due to curvature, is given by Draper and Herzberg (1971),
Cochran and Cox (1957, p 342), Peng (1967, p 160), and
Meyer (1971, p 116) as:

$$ \text{Sum of squares } L_2 = \frac{n_1 n_2}{n_1 + n_2}\,(\bar{y}_1 - \bar{y}_2)^2 $$

where:

n1 = Number of replicated center points

n2 = Number of non-center points (fractional
factorial portion)

ȳ1 = Mean response at center points

ȳ2 = Mean response at non-center points

L2 has one degree of freedom and is the sum of the βii
aliased in a single string.

The L1 sum of squares (for estimating interaction lack
of fit) can be calculated as follows:

Sum of Squares L1 = (Total Lack of Fit sum of squares) minus (L2 sum of squares)

L1 has one less degree of freedom than the total Lack
of Fit sum of squares had.

Variances are formed for L1 and L2 by dividing the sum of
squares by the degrees of freedom. These can be tested for
significance using the error term in the conventional way. If
there are so few degrees of freedom in the error term of the
unreplicated basic central-composite design as to make the
power of such a test questionable, it would be wiser for the
investigator to inspect the relative magnitudes of the
proportions of variance accounted for by each of the sources
of variance. (See Simon, 1976a).
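
The split described above can be sketched as a short routine. It takes the total lack-of-fit sum of squares and its degrees of freedom as inputs (they come from the first-order analysis of variance) and, as an assumption, estimates the error mean square from the replicated center points alone. The function name and the data are hypothetical.

```python
def split_first_order_lack_of_fit(center, non_center, lof_ss, lof_df):
    """Split a first-order lack-of-fit SS into L1 (interaction) and L2 (curvature)."""
    n1, n2 = len(center), len(non_center)
    y1 = sum(center) / n1            # mean response at center points
    y2 = sum(non_center) / n2        # mean response at non-center points
    l2_ss = n1 * n2 / (n1 + n2) * (y1 - y2) ** 2
    l1_ss = lof_ss - l2_ss           # obtained by subtraction
    l1_df = lof_df - 1
    # Assumption: error comes only from the replicated center points.
    error_ms = sum((y - y1) ** 2 for y in center) / (n1 - 1)
    return {
        "L1": {"ss": l1_ss, "df": l1_df, "ms": l1_ss / l1_df,
               "F": (l1_ss / l1_df) / error_ms},
        "L2": {"ss": l2_ss, "df": 1, "ms": l2_ss, "F": l2_ss / error_ms},
        "error_ms": error_ms,
    }


# Hypothetical responses: 4 center points and 8 factorial points.
CENTER = [3.1, 2.9, 3.0, 3.0]
NON_CENTER = [2.0, 3.0, 2.2, 2.8, 2.4, 2.6, 2.5, 2.5]

result = split_first_order_lack_of_fit(CENTER, NON_CENTER, lof_ss=1.5, lof_df=5)
for name in ("L1", "L2"):
    row = result[name]
    print(f"{name}: SS = {row['ss']:.4f}, d.f. = {row['df']}, MS = {row['ms']:.4f}")
```

The F ratios returned alongside the mean squares are only as trustworthy as the few error degrees of freedom allow, which is the caution raised in the paragraph above.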

Meyer  (1971,  p 116)  shows  how  this  technique  would  be 
used  with  a fractional  factorial  Resolution  IV  design 
augmented  with  center  points.  In  his  analysis  (p  117),  he 
isolated  all  linear  model  terms  plus  lack  of  fit  and  then 
error. The four degrees of freedom of the lack-of-fit term were further
isolated into 3 degrees of freedom for the cross-product
sources (L1) and one degree of freedom for the quadratic
sources (L2). In this 2^(4-1) design, the 3 degrees of freedom
for the cross-product source were actually for three
strings each with two two-factor interactions aliased with
one another. The single degree of freedom for the quadratic
source represents the sum of the coefficients of all
quadratic terms.

While the wording in Draper and Herzberg's paper (1971,
p 234, para 3.1) seems to suggest that this splitting of
the lack of fit in a first-order model is appropriate only
when the 2^(k-p) fractional factorial design is of "resolution
greater than four," this is not the case. This procedure
then could be used with Resolution IV screening designs to
determine whether an observed lack of fit is the result of
inadequate curvature or cross-product information, or both,
in the first order model.

Meyer  (1971,  p 123)  later  makes  an  important  point 
when  he  warns  his  readers  that  the  aggregate  sources  of 
variance  that  make  up  the  lack  of  fit  will  differ  depending 
on  the  experimental  design.  He  writes:  "Essentially,  they 
represent  terms  that  the  experimenter  could  have  included 
in  the  model  but  didn't."  Thus,  if  a lack  of  fit  test  is 
not  significant,  implying  an  adequate  representation,  the 
investigator  should  be  sure  that  the  terms  of  interest  are 
included  in  the  design.  Otherwise,  prediction  will  suffer. 

SPLITTING  THE  LACK  OF  FIT  OF  SECOND  ORDER  DESIGNS 

Draper and Herzberg (1971, p 235) specify that this
procedure for splitting the Lack of Fit sum of squares for
a second order central-composite design should be used only
when the cube part of the design is Resolution VII or higher.
A Resolution VII design enables all main and two- and three-
factor interaction effects to be isolated from one another.

In this case, L'2 is used to check for fourth order biases.
Since for most psychological research valid fourth order
effects are extremely unlikely (Simon, 1976b), any significant
Lack of Fit of the L'2 term would suggest that
unwanted sources of variance are distorting the data.

Calculations. L'1 will provide a test of third order
biases. The calculation of L'2 for the second order model
is more complicated than for the first order model. Draper
and Herzberg (1971, p 235) provide the following equation:

$$ L'_2\ SS = d\,(1 + dt)^{-1} \left\{ t(n-d)\,\bar{y}_1 - \bar{y}_0 + s \sum_{i=1}^{k} a_{ii} \right\}^2 $$

The meaning of each symbol is given in Table 16. The
L'2 SS has one degree of freedom.

L'1 is obtained by subtracting the sum of squares for L'2
from the total Lack of Fit sum of squares, thus:

L'1 SS = (Total LoF SS) - (L'2 SS)

The L'1 SS has one degree of freedom less than the
Total LoF SS.

If the second-order design is orthogonally blocked,
the sum of squares for blocking can be removed as usual.

Since L'1 is found by subtraction, removing the sum of
squares for blocks will reduce the size of L'1 but will not
affect L'2.




TABLE 16

SYMBOLS USED IN EQUATIONS TO CALCULATE L'2 SUM OF SQUARES
FOR THE SECOND-ORDER CENTRAL-COMPOSITE DESIGN

d = Number of center points

n = Total number of observations

k = Number of factors (independent variables)

c = Sum of non-center point coefficients squared

g = Sum of non-center point coefficients raised to 4th power

h = Sum of cross products between any pair of coefficients
squared over all non-center points*

$\bar{y}_1$ = Mean performance at non-center points

$\bar{y}_0$ = Mean performance at center points

$$ t = \frac{g + h(k-1)}{(n-d)\,[\,g + h(k-1)\,] - kc^2} $$

$$ s = \frac{-ct}{g + h(k-1)} $$

$a_{ii}$ = Sum of cross products between performance and coefficients
squared of factor i over all non-center points (where
i = 1,2,...,k)


*In the conventional central-composite design, this sum will
equal the number of points in the factorial (cube) portion of the design.
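
The L'2 calculation can be sketched directly from the symbols in Table 16. The routine below assumes a design that is symmetric across factors, so that c, g, and h are the same whichever factor (or pair of factors) is used to compute them, and it treats any point with all coded coordinates equal to zero as a center point. The function name, the two-factor design with star points at a distance of 2, and the response function are hypothetical.

```python
def second_order_curvature_ss(points, responses):
    """L'2 sum of squares (1 d.f.) for a second-order central-composite design."""
    k = len(points[0])
    center_y = [y for pt, y in zip(points, responses) if not any(pt)]
    outer = [(pt, y) for pt, y in zip(points, responses) if any(pt)]
    n, d = len(points), len(center_y)
    c = sum(pt[0] ** 2 for pt, _ in outer)               # coefficients squared
    g = sum(pt[0] ** 4 for pt, _ in outer)               # raised to 4th power
    h = sum(pt[0] ** 2 * pt[1] ** 2 for pt, _ in outer)  # squared cross products
    w = g + h * (k - 1)
    t = w / ((n - d) * w - k * c ** 2)
    s = -c * t / w
    y1 = sum(y for _, y in outer) / (n - d)              # non-center mean
    y0 = sum(center_y) / d                               # center mean
    a_sum = sum(sum(x * x for x in pt) * y for pt, y in outer)  # sum of a_ii
    predicted_center = t * (n - d) * y1 + s * a_sum
    return d / (1 + d * t) * (predicted_center - y0) ** 2


CUBE = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
STAR = [(-2, 0), (2, 0), (0, -2), (0, 2)]
CENTER = [(0, 0)] * 3
DESIGN = CUBE + STAR + CENTER


def quadratic(x1, x2):
    """Hypothetical response that is exactly second order."""
    return 10 + x1 - 2 * x2 + 0.5 * x1 * x2 + 1.5 * x1 ** 2 - 0.8 * x2 ** 2


exact = [quadratic(*pt) for pt in DESIGN]
shifted = [y + 1.0 if not any(pt) else y for pt, y in zip(DESIGN, exact)]
print(second_order_curvature_ss(DESIGN, exact))    # essentially zero
print(second_order_curvature_ss(DESIGN, shifted))  # bias at the center shows up
```

When the responses follow a second-order model exactly, the non-center points predict the center response without error and L'2 vanishes. Any disagreement between that prediction and the observed center mean is what the single degree of freedom measures.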




EFFECTS  OF  REPLICATING  NON-CENTER  POINTS  OF  THE  CCD 


Draper and Herzberg (1971, p 233) comment on this,
stating that ". . . if the center points are not the only
replicated points in the design there are slight changes
in the above which do not materially affect the situation."
They cite some notational changes that might be made but
indicate that it would not be necessary to make any changes
in the calculation of L2 or L'2. Although L1 and L'1
would be affected by the change, the computations remain the
same since their sums of squares are obtained by subtraction.

ADDITIONAL  CRITERION  FOR  EVALUATING  THE  EQUATION 

Suich and Derringer (1977, p 213) note that ". . . the
significance of the regression F-ratio and the nonsignificance
of the lack-of-fit F-ratio do not necessarily imply
that $\hat{Y}(X)$ is an adequate [predictive] model." At best,
when the regression F-ratio exceeds the critical F value
for significance, this only indicates that the fitted equation
is probably a better predictor of performance than the
mean  of  the  data  would  be.  Such  information  is  of  little 
practical  value.  Draper  and  Smith  (1966,  p 64)  suggest  that 
".  . . unless  the  range  of  values  predicted  by  the  fitted 
equation  is  considerably  greater  than  the  size  of  the  random 
error,  prediction  will  often  be  of  no  value  even  though  a 
'significant'  F-value  has  been  obtained,  since  the  equation 
will  be  'fitted  to  the  errors'  only."  J.  M.  Wetz  (1964),  a 
student  of  G.  E.  P.  Box,  in  a Ph.D.  dissertation,  suggested 
that  the  F-ratio  of  the  equation  would  have  to  exceed  some 
criterion  F-value  by  about  a factor  of  four  to  be  rated  as 
a satisfactory  prediction  tool. 

Suich and Derringer (1977) provide ". . . a numerical
criterion, γ, which quantifies the range of values predicted
by [a second degree polynomial] relative to the size of the
standard error. That is, the importance of the standard
error is considered in light of the magnitude of the changes
to be estimated by the model itself . . ." (p 213). This
equation is:

$$ \gamma = \left[ \frac{1}{m} \sum_{i=1}^{n} \frac{(Y_i - \bar{Y})^2}{\sigma^2} \right]^{1/2} $$

where

Y_i = Each performance score
Ȳ = Mean performance
σ = Standard error
m = Number of terms in equation excluding the constant
n = Number of observations


Calculation and Test

Instead of comparing the F-value obtained by the usual method:

$$ F = \frac{\text{Regression mean square}}{\text{Error mean square}} $$

with the standard F-value taken from a central-F distribution
(published in most statistics books that deal with the
analysis of variance), Suich and Derringer test the hypothesis that γ
is or is not greater than some non-zero value considered to
be an important difference for a particular situation. To do
this, they develop an equation to calculate a non-central
F-value to compare with the F obtained from the experimental
data. This non-central F (i.e., $F'_{\alpha,\,m,\,n-m-1,\,\gamma^2}$) can be
estimated for any risk level, α, and particular pair of
degrees of freedom, m and (n-m-1), by adjusting the standard
F-value found in the conventional tables. This relationship
is:

$$ F'_{\alpha,\,m,\,n-m-1,\,\gamma^2} = (1 + \gamma^2)\, F_{\alpha,\,b,\,n-m-1} $$

where:

$$ b = \frac{m\,(1 + \gamma^2)^2}{1 + 2\gamma^2} $$

and (n-m-1) is the error degrees of freedom, and α is the acceptable
risk level of committing a Type I error (i.e., stating
that a difference exists when in fact it doesn't).

The γ substituted in this equation is not calculated
from the data, but is the degree of variation required for
importance.* To select an F' approximately four times
the size of F (as Wetz had suggested), making γ equal
to 2 would roughly produce that result. However, the
decision of how large this value should be is up to the
investigator and a matter of experience. The statisticians
who suggested that the value might be
four were not working with human performance data (more
likely it was chemical engineering data), so we will have
to try it and see how it works. Certainly any more critical
criterion than the one currently in use is likely to produce
a better predictive equation, although Suich and Derringer
say it ". . . is not meant to be a final answer to the
problem but more as a benchmark or rule-of-thumb to help
in answering this difficult question. . ." (p 216).

*We could decide to use a γ value calculated from the
data using the aforementioned equation and substitute that
into the equation relating F' to F, but reversed thus:

$$ F = \frac{F'}{1 + \gamma^2} $$

with the appropriate degrees of freedom indicated above for
both F and F'. Then by using the standard F-distribution
tables, along with some interpolation, we could find the
risk level, α, for accepting the equation as a predictor by
searching the table for the F value for the indicated degrees
of freedom closest to the one calculated above. One would
need a set of F-tables that gives F-values for a range of
probability values (e.g., Fisher and Yates, 1963).

If the regression F-value is less than F', the investigator
would reexamine two things: 1) is his error variance
too  large  because  of  too  small  a sample?  2)  is  the  equation 
model  adequate  or  should  it  be  expanded?  Both  require  more 
data  to  be  collected.  If  the  regression  F equals  or  is 
larger  than  F',  then  we  have  increased  our  confidence  in  the 
equation  as  a predictive  model.  Suich  and  Derringer  provide 
an  example  of  this  test  (pp  214-216) . 
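
The adjustment of the tabled F-value can be sketched as below. The sketch only computes the multiplier (1 + γ²) and the adjusted numerator degrees of freedom b. The tabled central F for b and (n-m-1) degrees of freedom must still be looked up and supplied by the user, and the value 2.7 used here is a placeholder, not a table entry. Function names are hypothetical.

```python
def adjusted_f_parameters(m, gamma):
    """Return (multiplier, b) for converting a tabled F into the criterion F'."""
    multiplier = 1.0 + gamma ** 2
    b = m * (1.0 + gamma ** 2) ** 2 / (1.0 + 2.0 * gamma ** 2)
    return multiplier, b


def criterion_f(tabled_f, gamma):
    """F' given the tabled central F read at (b, n - m - 1) degrees of freedom."""
    return (1.0 + gamma ** 2) * tabled_f


# Example: 9 terms besides the constant, gamma chosen as 2.
m, gamma = 9, 2.0
multiplier, b = adjusted_f_parameters(m, gamma)
print(f"multiplier = {multiplier}, numerator d.f. b = {b}")

PLACEHOLDER_TABLED_F = 2.7  # hypothetical; replace with the tabled value for (b, n-m-1)
print(f"F' = {criterion_f(PLACEHOLDER_TABLED_F, gamma):.2f}")
```

Because b is larger than m, the tabled F read at b degrees of freedom is somewhat smaller than the one read at m, which is why a multiplier of five with γ = 2 yields a criterion of roughly four times the conventional critical value.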


X. ANALYZING THE DATA FROM AN
INCOMPLETE SCREENING EXPERIMENT


An experimenter may be required to do an analysis
"on-line" each time a new piece of data has been collected.
For example, he may wish to check his results as soon
as the data is collected in order to decide whether to stop
or to modify the experimental program. Or, he may wish to
keep abreast of the data in the event the experiment is
inadvertently terminated prematurely. While a regression analysis
can be performed relatively quickly with a modern computer,
it may not be convenient or may be too costly to make one
available for this purpose.

Hunter (1964) has provided a "predictor-corrector" (P-C)
equation that can be used to determine the regression coefficients
in a polynomial model after the data has been collected
on each experimental condition of a screening design (or for
that matter, any 2^k and 2^(k-p) design), provided that an
initial set of orthogonal estimates of the coefficients is
available. This means that if a screening design is made up
of blocks of Resolution III designs, then once the first
block has been completed (enabling the coefficients of a
first order polynomial to be estimated), a new equation
can be determined relatively quickly after data has been
collected at a new data point. The predictor-corrector
equation provides an exact least squares estimate, an update,
of all the coefficients without elaborate calculations or
the need for a high-speed computer.

REQUIREMENTS FOR USING THE P-C EQUATION

Two conditions must be satisfied before the equation
can be used:

1) The estimated coefficients from at least a single
Resolution III block must be available. More, or
higher resolution blocks are acceptable.

2) The rows of the new data points must be orthogonal.
That means that the sum of the cross products
between corresponding coefficients (i.e., plus and minus
ones) of the sign matrix making up any two rows
must equal zero.

Both conditions are met in a 2^(k-p) screening design made up
of two Resolution III blocks. They would also be met if
one Resolution IV design, to represent the initial block,
had been completed and was in the process of being replicated,
or a new plan begun.


PREDICTOR-CORRECTOR EQUATION


The P-C equation provided by Hunter (1964, p 43) is:*

$$ d_i = \frac{1}{mN + q}\,(Y_i - \hat{Y}_i)\,x_i' $$

where:

q = number of coefficients in the model, q ≤ N

m = number of blocks of N conditions already
completed

N = number of conditions per complete block

x_i = row vector of coefficients (i.e., ±1) of
independent variables associated with
the ith experimental condition

Y_i = new performance score associated with
the ith experimental condition

Ŷ_i = predicted performance score associated with
the ith observation

*Italicized letters are matrix symbols.

The correction constants, d_i, for the ith condition are
combined with the coefficients (B) from the previous block
to get the revised coefficients (B*), thus:

$$ B^* = B + \sum_{i=1}^{n} d_i $$


The  variance  of  each  coefficient  is  calculated: 


_1 

Variance  (b*)  * mN 


EXAMPLE

How the equation is used can best be explained by means
of an illustration. Fictitious data for a 2^(3-1) fractional
factorial experiment with 8 observations are given in Table
17. Eight observations enable two Resolution III blocks
of data to be collected. We will presume that the first
block was run and the coefficients for the linear terms were
calculated. We will use the predictor-corrector equation
to obtain the least squares equation after the results from
the 5th and 6th data points are each obtained. The procedure
for calculating the new coefficients after each new
experimental condition has been completed is as follows:

1. Calculate the q coefficients from the N experimental
conditions in Block I. Yates' algorithm can be used
to obtain the effects-total, which are divided by N
to obtain the coefficients. The first four (N)
experimental conditions in Table 17 make up
Block I and the four (q) coefficients for this
data are shown in Table 18 at [a].

TABLE 17

IMAGINARY DATA WITH WHICH TO ILLUSTRATE
AN INCOMPLETE ANALYSIS

                      Exptl.
                      Condition   (I)   A    B    C    Performance

                 1    c            +    -    -    +       1.3
    Block I      2    a            +    +    -    -       3.6
    (I=+ABC)     3    b            +    -    +    -       2.4
                 4    abc          +    +    +    +       1.7

                 5    ab           +    +    +    -       2.5
    Block II     6    bc           +    -    +    +       1.5
    (I=-ABC)     7    ac           +    +    -    +       2.8
                 8    (1)          +    -    -    -       3.4

                 9    c            +    -    -    +       1.2

TABLE 18

WORKING DATA TO OBTAIN UPDATED EQUATIONS

                        Y       Ŷ       (Y - Ŷ)/8
    Exptl. cond. #6     1.5     1.15     .0438

                        Mean     A       B       C
    Coef. I+5           2.519   .419   -.219   -.731
    Coef. I+6           2.544   .356   -.156   -.706



2. Solve for the denominator of d_i, the correction
constants:

$$ d_i = \frac{1}{mN + q}\,(Y_i - \hat{Y}_i)\,x_i' $$

In this example,

m = 1 block already completed

N = 4 conditions in the complete block

q = 4 coefficients in the model
(including mean)

Therefore, the P-C equation for this problem reduces
to:

$$ d_i = \frac{1}{(1)(4) + 4}\,(Y_i - \hat{Y}_i)\,x_i' = \frac{(Y_i - \hat{Y}_i)\,x_i'}{8} $$

3. Determine the estimated performance for the new
data point, Ŷ. This is the sum of the cross
products between the Block I coefficients and
corresponding ±1 coefficients of the new data
point. Include the plus and minus signs in this
operation.

For example, in Table 17 to obtain the estimated
performance for experimental condition #5, the
following steps are performed:


    Block I Coefficients:   +2.50   +.40   -.20   -.75

    Exptl. Cond. #5:          +1     +1     -1     +1

    Ŷ5 = +(+2.50) +(+.40) -(-.20) +(-.75) = 2.35

This value is located in Table 18 at [b].


4. Calculate the correction constant by subtracting
the estimated performance, Ŷ_i, for experimental
condition i (such as the one just calculated for
experimental condition #5) from the actual performance,
Y_i (found in Table 17 and located in
Table 18 at [c]). Divide this difference by
the denominator of d_i, which was calculated in
step 2:

      Y =  2.5
    - Ŷ =  2.35
           .15 divided by 8 = .01875 = +.019

which is the correction constant for experimental
condition #5.

5. Add this correction constant (using the sign
vector of the particular experimental condition)
to the corresponding coefficients from the previous
estimate to obtain the new estimates. These are
the coefficients for the new fitted equation.

Continuing with our example:

    Coefficients of previous equation
    (Block I):                             2.50    +.40    -.20    -.75

    Constant w/signs of coefficients
    of experimental condition #5:         +.019   +.019   -.019   +.019

    New equation: combined data from
    Block I and experimental
    condition #5:                 Ŷ = 2.519 + .419A - .219B - .731C


6. The procedure would be repeated when performance for
a new data point (#6) is obtained. The estimated
performance, Ŷ, is still obtained using the
coefficients from Block I. The coefficients for the
new equation, however, are obtained by adding the
new correction constant, multiplied by the coefficient
of the corresponding columns of experimental
condition #6, to the corresponding coefficients of
the previous equation derived by combining Block I
and experimental condition #5.

    Coefficients (Block I):     2.50    .40    -.20    -.75

    Exptl. Cond. #6:             +1     -1      +1      +1

    Ŷ = 2.50 - .40 - .20 - .75 = 1.15

    Y = 1.5

    (Y - Ŷ)/8 = (1.5 - 1.15)/8 = .35/8 = +.04375   Correction Constant

    Equation I+5:               2.519   .419   -.219   -.731

    Constant w/signs #6:        +.044  -.044   +.044   +.044

    New equation for combined
    data from Block I and
    exptl. cond. #5 and #6:     Ŷ = 2.563 + .375A - .175B - .687C

This procedure continues as each new data point is
added.

If it is not necessary to estimate the equation each
time a new data point is added, the correction constants
along with the appropriate signs for the specific experimental
conditions can be summed together and added to the
original block coefficients. For example, after both experimental
conditions #5 and #6 have been taken, the new coefficient
for Factor A in the above example would be:

    Coeff. from Original Block:     .40

    #5 constant:                    +(+.01875)

    #6 constant:                    -(+.04375)

    New coefficient:                .375 (Factor A)

Computations can be made more easily when many data points are
to be added if a tab with the list of correction factors (with
signs) listed on it is laid next to each sign column and
added or subtracted accordingly.
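
The whole updating procedure can be sketched in a few lines. The sketch takes the Block I coefficients (2.50, .40, -.20, -.75) as given and uses the sign rows (+1, +1, -1, +1) and (+1, -1, +1, +1) for the two added points, with performances 2.5 and 1.5. As a check on the least squares claim, it also builds a set of Block I responses that reproduce those coefficients exactly (an assumption made only for the check) and confirms that the normal equations are satisfied by the updated coefficients. Function names are hypothetical.

```python
def pc_correction(b_block, x, y, m, n_block):
    """Scalar correction constant (Y - Y_hat) / (mN + q) for one new data point."""
    q = len(b_block)
    y_hat = sum(bj * xj for bj, xj in zip(b_block, x))  # prediction from block coefficients
    return (y - y_hat) / (m * n_block + q)


def pc_update(b_block, new_points, m, n_block):
    """Add each signed correction constant in turn; rows must be mutually orthogonal."""
    b = list(b_block)
    history = []
    for x, y in new_points:
        d = pc_correction(b_block, x, y, m, n_block)
        b = [bj + d * xj for bj, xj in zip(b, x)]
        history.append((d, list(b)))
    return history


def normal_equation_residual(rows, responses, b):
    """X'(y - Xb); every element is zero at the least squares solution."""
    resid = [y - sum(bj * xj for bj, xj in zip(b, x)) for x, y in zip(rows, responses)]
    return [sum(x[j] * r for x, r in zip(rows, resid)) for j in range(len(b))]


B_BLOCK_1 = [2.50, 0.40, -0.20, -0.75]          # mean, A, B, C
NEW_POINTS = [([+1, +1, -1, +1], 2.5),          # experimental condition #5
              ([+1, -1, +1, +1], 1.5)]          # experimental condition #6

for number, (d, coefs) in zip((5, 6), pc_update(B_BLOCK_1, NEW_POINTS, m=1, n_block=4)):
    print(f"after #{number}: d = {d:+.5f}, coefficients = "
          + "  ".join(f"{c:+.4f}" for c in coefs))
```

Each correction constant is computed from the Block I prediction, not from the most recently revised equation, which is why the constants can also be summed and applied in one pass.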

If a second block of N = q experimental conditions is
run (in this example, four more), further revisions of
the equation would be based on the coefficients derived from
the data from both blocks. This would also require a change
in the denominator of the correction constant:

• If the number of coefficients to be estimated
continued to be 4 (q), then since there are now
2 (m) blocks completed with 4 (N) conditions per
block, the denominator of the correction constant
would be: (2 x 4) + 4 = 12

• If the number of coefficients, q, including the
mean, is expanded to 8 (which is possible with
8 independent observations), this would make
the block size, N, equal to 8, now to be considered
a single block. The denominator of the correction
constant would be: (1 x 8) + 8 = 16.

Remember, all estimates are based on the data from the most
recently completed block of a size capable of estimating all
the coefficients.

MISSING  DATA 

At first glance it would appear that this process could
be used to fill in missing data. For example, if all data
points of the first block and all but one somewhere in the
second block were completed, then a least squares fit of
the available data made by using the P-C equation could be
used to predict performance in the missing cell. In theory,
this is true. In practice, for any cell of a 2^(k-p) design,
the equation obtained from the Block I data would provide
the same estimate of a missing performance value at a point
within the experimental design as would an equation derived
after the data from the incomplete block has been added to
that of the first block. This anomaly occurs because each
condition in the new block is orthogonal to the first block
and therefore does not affect the original estimates.

However,  the  equation  based  on  the  old  block  data  plus 
the  data  from  the  new  incomplete  block  will  provide  better 
estimates  of  data  points  anywhere  in  the  experimental  space 
except  those  that  a_re  a £art  of  the  experimental  design. 
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APPENDIX 


APPENDICES

APPENDIX I.  SUPPORT DATA FOR 2^(8-4) SCREENING DESIGNS, N = 16
    Three-factor interaction strings aliased to main effects
    Two-factor interactions aliased in strings
    Inner-product sums

APPENDIX II.  DATA FOR 2^(16-11) SCREENING DESIGNS, N = 32
    A. Screening design (sign matrix, percent overlap, factor-level
       change count, old factor labels, new screening labels)
    B. Three-factor interaction strings aliased to main effects
    C. Two-factor interactions aliased in strings
    D. Inner-product sums

APPENDIX III.  DATA FOR 2^(32-26) SCREENING DESIGNS, N = 64
    A. Screening design (sign matrix, percent overlap, factor-level
       change count, old factor labels, new screening labels)
    B. Three-factor interaction strings aliased to main effects
    C. Two-factor interactions aliased in strings
    D. Inner-product sums

APPENDIX IV.  COMPUTER PROGRAM FOR OBTAINING SCREENING DESIGN ALIASES

APPENDIX V.  PROBABILITY VALUES FOR CONSTRUCTING HALF-NORMAL GRIDS

APPENDIX VI.  DERIVATION OF COMBINED LINEAR AND CUBIC TREND-ADJUSTMENT
    EQUATIONS

APPENDIX VII.  HOW TO CALCULATE DETERMINANTS FOR ANALYSES FOR TWO
    AND THREE RESPONSE DESIGNS

APPENDIX VIII.  ZAHN'S GUARDRAILS FOR HALF-NORMAL PLOTS
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Original  Factorial  Labels 

    AD    ACD   AC    ABD   ABCD  ABC   AB    A

Three-Factor Interaction Strings Aliased with Main Effect

    ABH   ABF   ABD   ABE   BCE   ACE   ABC   ABG
    ACD   ACG   ACH   ADG   BDF   ADF   ADH   ACF
    AEF   AEH   AEG   AFH   BGH   AGH   AFG   ADE
    BCF   BCH   BCG   BDH   CDG   CDH   BDG   BCD
    BDE   BEG   BEH   BFG   CFH   CFG   BFH   BEF
    CEH   CEF   CDE   DEF   DEH   DEG   CDF   CEG
    DFH   FGH   DGH   EGH   EFG   EFH   CGH   DFG

New Factor Main Effects

    G     D     F     C     A     B     E     H

APPENDIX  I-B 

TWO-FACTOR  INTERACTIONS  ALIASED  IN  STRINGS 


Original Factorial Labels

    D     CD    C     BD    BCD   BC    B

Two-Factor Interaction Strings Aliased with Main Effect

    AB    AE    AC    AF    AH    AG    AD
    CE    BC    BE    BD    BG    BH    BF
    DF    DH    DG    CH    CF    CD    CG
    GH    FG    FH    EG    DE    EF    EH

Where  cell  has  been 


APPENDIX II

DATA FOR 2^(16-11) SCREENING DESIGNS, N = 32


2^(16-11) SCREENING DESIGN


EXPERIMENTAL CONDITION

BCDELMNO
AFGHIJKP
AEFGHMNO
BCDIJKLP
ADFIJLNO
BCEGHKMP
BCGHIJNO
ADEFKLMP
ACGIKLMO
BDEFHJNP
BDFHIKMO
ACEGJLNP
BEFGJKLO
ACDHIMNP
ACDEHJKO
BFGILMNP
ABHJKLMN
CDEFGIOP
CDFGJKMN
ABEHILOP
CEFHIKLN
ABDGJMOP
ABDEGIKN
CFHJLMOP
DEGHIJLM
ABCFKNOP
ABCEFIJM
DGHKLNOP
ABCDFGHL
EIJKMNOP


APPENDIX  II-A 


NEW SCREENING LABELS (Main Effects)*

(I)  A  B  C  D  ...

[Sign matrix not legible in this copy.]

PERCENT# TREND/EFFECT OVERLAP***: Linear, Quadratic, Cubic

[Overlap percentages not legible in this copy.]

FACTOR  LEVEL  CHANGE  COUNT  1 Q | 21  20  I 22  18  26  i 25  1 19  |17  1 27  25  1 29  16  J2A  28 

*Three-factor interaction strings aliased with main effects are listed in Appendix II-B.
**Two-factor interaction strings aliased with two-factor interaction labels are listed in Appendix II-C.
***Inner-product sums are listed in Appendix II-D.

#Blank spaces represent zero percent. Spaces with zeroes in them represent some percent smaller than 1%.
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15  117  | 27  1 25  1 29  16  [24  28  1 30 



31  1 10  11 


2 10  39 


9 13  5 I 8 Il2  14 


A 18  71 


3 9 


4 6 2 15  7 


m 

19 

75 

■ 

■ 

21 

11 

3 

1 

APPENDIX II-B

THREE-FACTOR INTERACTION STRINGS ALIASED TO MAIN EFFECTS


NEW 

MAIN 

EFFECTS 


ORIGINAL 

FACTOR 

LABELS 


ALIASED  THREE-FACTOR  INTERACTION  STRINGS 


0 

K 

N 

J 

F 

L 

P 

E 

I 

M 

H 

D 

C 

A 

B 


AE 

A BP 

ACK 

ADJ 

AEH 

AFN 

ACM 

AIL 

CFP 

CGI 

CHJ 

CLM 

DPI 

DCF 

DHK 

HLP 

1JN 

IKK 

JVP 

KNP 

ADE 

ABN 

ACO 

ADE 

AFP 

AG1 

AHJ 

ALM 

CFN 

CGM 

C1L 

DFM 

DON 

DMO 

DLP 

HLN 

IJP 

IKO 

JtMN 

NOP 

AD 

ABK 

ACP 

ADI 

AEG 

AFG 

AHK 

AJL 

CFK 

CGJ 

CHI 

DFJ 

DGK 

DHP 

DU) 

HKL 

IJO 

IKP 

JKK 

KOP 

ACE 

A By. 

ACE 

ADO 

AF1 

AGP 

AHK 

ALN 

CGN 

CHO 

CLP 

DEH 

DFN 

DGM 

' D1L 

HLK 

IKP 

INO 

KKU 

MOP 

ABC 

ABC 

ADL 

AEK. 

agh 

All 

AKP 

ANG 

CHL 

CJM 

CKN 

CoP 

DEP 

DIO 

DJN 

HJP 

HKO 

ILN 

JLO 

LKP 

AB 

AliK 

ACG 

ADF 

aep 

AIO 

AJK 

AXK 

CIK 

CJP 

CKO 

DEM 

DGH 

DU 

DKP 

GKO 

GNP 

HJK. 

HKN 

Hop 

A 

ABO 

ACN 

ADM 

AEL 

AFK 

AGJ 

AHI 

CFO 

CHK 

CJL 

DEF 

DGO 

DHN 

DKL 

HLO 

1JK 

1K.N 

J MO 

KNO 

ACDE 

ABI 

ACJ 

ADK 

AFK 

AGN 

AHO 

ALP 

CGP 

CHK 

CLN 

DFP 

DGI 

DHJ 

du<: 

IOP 

JKO 

JNP 

kmp 

KNO 

ACD 

ABE 

ACV 

ADK 

AFJ 

AGK 

AHP 

ALO 

CGO 

CHN 

CKL 

DEG 

DFO 

DHM 

DJL 

GLM 

CKP 

JNO 

KKO 

MNP 

AC 

ABJ 

AC  I 

ADP 

AEF 

AGO 

AHN 

AKL 

CCK 

CHP 

CLO 

DEL 

DFK 

DGJ 

DHI 

HJL 

iKO 

INP 

JKN 

Jop 

ABE 

ABL 

ACD 

AEG 

AFG 

AIP 

AJK 

AMN 

CIN 

CJO 

CKP 

DEJ 

DGL 

DIM 

DKO 

CKP 

GNO 

JLK 

KLN 

LOP 

ABDE 

ABG 

ACH 

AEK 

AFL 

AIN 

AJO 

AMP 

CIP 

CJK 

CMN 

EFP 

EG  I 

EMJ 

ELK 

HKP 

HNO 

ILM 

KLO 

LNP 

A BCE 

ABF 

ADH 

AEJ 

AGL 

AIM 

AKO 

ANP 

DIP 

DJK 

DKN 

EFI 

EGP 

EHK 

ELN 

HJO 

HKP 

IKL 

JLP 

LKO 

ABCDE 

BCF 

BDG 

BE  I 

BHL 

BJM 

KlN 

BOP 

DIN 

DJO 

DKP 

EFM 

EGN 

EHU 

ELP 

HJK 

HKN 

ILO 

JLN 

KLK 

ABCD 

ACF 

ADO 

AE1 

AHL 

AJK 

AKN 

AOP 

D1K 

DJP 

DMO 

EFJ 

EGK 

EliP 

ELO 

HJN 

HKM 

1LP 

JftL 

i.:.,N 

ABD 

ABD 

ACD 

afh 

AFH 

AIK 

AIP 

AKO 

CIO 

CJN 

CKF 

DEI 

DHL 

DJ!1 

DKN 

HKP 

KUO 

ILM 

KLO 

LNP 

BCN 

BDM 

BEL 

BFK 

BGJ 

BHI 

CDE 

GKL 

DLN 

EFG 

EIP 

EJK 

EMN 

FHK 

FJL 

GHN 

BCP 

BD1 

3EG 

BFO 

BHK 

BJL 

CDJ 

CEH 

EFL 

E1N 

EJO 

EKP 

FGJ 

FHI 

GHP 

GLO 

BCO 

BDE 

BFP 

BGI 

BHJ 

BLM 

CDM 

CEL 

EFH 

E1K 

EJP 

EKO 

FGK 

P1L 

GHO 

GLP 

BC1 

BDP 

BEP 

BGO 

BHN 

BKL 

CDK 

CFK 

EGL 

E1M. 

EKO 

ENP 

FGK 

’ FHP 

FLO 

GHI 

BDH 

BEJ 

BGL 

BIX 

BKO 

BNP 

CDG 

CE1 

DKN. 

ECO 

EHN 

EKL 

G1P 

GJK 

GKN 

HIK 

BCD 

BEO 

BFG 

BIP 

BJK 

BKN 

CEN 

CFH 

DNO 

EFK 
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APPENDIX  IV 

COMPUTER  PROGRAM  FOR  OBTAINING  SCREENING  DESIGN  ALIASES 


SIBI  /*  FROC.RAN  ALUS  •/ 

/•  WRITTEN  BY  HONARE  B.  IE!  «/ 

/* THIS COMPUTER PROGRAM HAS BEEN WRITTEN IN PL/1 FOR AN IBM 360/91 */
/* COMPUTER.                                                         */
/* PROGRAM FOR COMPUTING TWO AND THREE WAY ALIASES FOR FRACTIONAL    */
/* FACTORIAL DESIGNS (SCREENING DESIGNS).                            */

1 Aircirfoc  cmcNS(NAiN)  j 

2 rCL  GCOG  rm  SIREAN  CUTPUT; 

3 DCL  P(32,8)  CIIAS(I)  ,DZ(d,J2)  Clt  AK  ( 1 ) , It  (12)  FIXED  El  H APT  ( 1 5, 0)  J 
« DCL  ZZ  CHAR  (6)  , A (32. H)  CHAH  ( 1)  . H F ( 32)  FIXED  BINARY  ( 15,0)  J 

5 LCt  (11(32)  FIX1D  tIN.AEt  (15,0)  ,pl  CHAR(2); 
t ICL  F(4961)  CIIAMS)  ,0C*°b1)  CHAR  (3)  ; 

7 CCl  LCC<4401)  FIXll  H l NARY (IS ,0) ; 

8 CCt  KCCI4961)  PIXEL  E 1 N A K Y (>1 , 0 ) i 

9 KIP*  15; 

/*                                                                   */
/* THIS ROUTINE IS USED TO COMPUTE THE ALIASES FOR BOTH TWO FACTOR   */
/* INTERACTIONS AND THREE FACTOR INTERACTIONS.                       */
/*                                                                   */


10 

11 

12 

13 

14 

15 

16 


AllAS:PROC(N,(1X,R,NP,X,l,ll,NZ,r,XF,K8,TRIP)  I 
CCL  S (12,9)  C1IAR(1)  i 
CCl  X(*,*)  CHER(1)  , (•(♦,*)  CHAR  ( 1)  ; 

CCl  (N  (•)  , NF  ( •) ) tlXEt  BINARY  (15 ,0) ; 

CCl  (N.HE.l,  U.RT)  FIXIl  BIN  AS  Y ( 15, 0)  ; 

ECL  1!  CHAR  (1)  I 
CCl  ZE  CIIAR(I)  ; 

/*                                                                   */
/* COMPUTATIONS TO FIND THE TWO FACTOR INTERACTION TERMS             */
/* CHECKS THE LETTERS OF ONE LIST AGAINST THE OTHER. WHEN THERE IS A */
/* MATCH, THE PROGRAM SKIPS TO THE NEXT LETTER AND CHECKS IT AGAINST */
/* THE LETTERS OF THE SECOND LIST. IF NO MATCH IS FOUND, IT IS STORE*/
/* D IN THE ARRAY P. TO CHECK FOR THE POSSIBILITY THAT A MATCH MAY   */
/* NOT OCCUR WHEN MATCHING EACH ELEMENT OF THE SECOND LIST AGAINST   */
/* THE FIRST, THE SEARCH IS PERFORMED IN THE OPPOSITE DIRECTION      */


CO  J*1  TO  HE; 
0 TO  HELL; 
KP*XP*1; 


V 

•/ 

*/ 

*/ 


17 

KP  • 1 ! 

ia 

LCCrsOC  I«1  TC  N; 

20 

IF  X (1,1)  «I(LL,J) 

nisK 

22 

END;  f (KF,KP)*i(l, 

i) ; 

25 

hrill  END  LOOT! 

26 

FEVER :EC  1*1  TC  HE; 

28 

IF  X(ll,I).X(l,J) 

THEN 

31 

Mt*KP-1; 

32 

ZE*X  III. I) ; 

/♦ 

DO  J*1  TO  N; 


INC; 


/* THOSE LETTERS THAT HAVE NO MATCH IN EACH LIST ARE SORTED TO APPEAR*/
/* IN A NICE MANNER. THESE ARE THE FINISHED PRODUCT.                 */
/*                                                                   */

33  LAPlTO  LK*1  TO  KPPi 

34  IE  2E  <T  (KR,KK)  TbIN  DO; 

J6  tt*P(KR ,KK)  ; 

37  P(KR,KK)»7.El 


38  Z!«TE; 

39  END;  ENU  LAI; 

41  F (K.1,  XP)  «ZE;  KP>  KP*  I;  IIBAVENtEND  NEVER; 

/•  CrRrill.ATIOKS  FOR  THE  TIINFI  FACTOR  INTERACTION  IENRS.  */ 

44  KP*  1; 

45  IF  TNI P*0  THEN  CC  TO  ZAP; 

/*  CH'CKS  THE  TWO  TERN  INTERACTIONS  AGAINST  A THINE  LIST  FO*  A RATCB  */ 
47  DC  K*tl*1  TC 
411  K*  * 1 ; 

4U  DC  1*1  TO  KP  • I j CO  .1-1  TO  .1  (X)  ; 

/•  IF  A NATCH  OCCURS,  SKIP  TO  THE  NEXT  LtTTEF  IN  TNE  LIST.  If  NO  •/ 
/•  HATCH  OCCURS,  THIN  ASSIGN  THAT  LETTER  TO  THE  A1RIY  S.  */ 

SI  IF  P(NS,I)»X(X,J)  THEN  GO  TO  tUS; 

53  END; 

54  S |K0,KE)  «P(KR,I)  ; 

5j  Kl*KE ♦ 1 ; 

56  IlIRiFRC; 
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t*  •/ 

/*  NEXT,  U f CHICK  fOS  NC  HATCH  Of  LETT  ESS  IN  THE  FEFERSE  ORDER  •/ 

/•  */ 

57  K?r t CO  I»1  TO  H(X) ; 

58  CC  J*1  TC  FP-1; 

59  If  X(K,T)*P(KF,J)  THIN  CO  TO  SHON; 

n fro; 

62  S (KO.Kf)  «X(K,I)  i 

63  K t*Kf  ♦ 1 ; 

6**  rfC*:*NP  REP; 
t>5  N F ( KB)  "KE-lj 

66  KF»  <E  ♦ 1 ; 

n7  INC; 

88  KF*  KP ; 

n9  CO  I « 1 TO  KO-1; 

70  S V-  1 ; 

/*  FORT  THE  LETTERS  IK  THE  THIRD  CFDER  INTERACTION  TIERS  INTO  ORDIB.*/ 

71  dc  w:iue(sh-.*/')  ; su*o ; 

71  DO  J»2  TO  NE  (I)  j 

74  ’ F 3(I,J)<S<I,J*1)  THIN  DO; 

76  _ U*S(I,J);  S<I,J)*S(I,J-1)  ; S (I . J - 1) -TS; 

74  Ska  1 j 

80  IRC; 

81  TNI); 

82  EH; 

/•  FLACF  THE  TIIIRO  ORCEE  INTERACTIONS  BACK  INTO  TNI  ARRAY  P.  •/ 

83  CC  J*  ' -C  NF  ( I)  ; 

84  P (I,  J)  *S  (I , J)  ; 

85  END ; 

86  END; 

87  CC  TC  MAFF; 

88  2AP:NF (KR) *KP-1j 

89  KF«Kf«1; 

90  HARP: END  ALIAS; 


/* END OF THE SUBROUTINE ALIAS.                                      */

/* READ IN NZ, THE NUMBER OF LISTS TO BE COMBINED IN TWO AND THREE   */
/* INTERACTIONS. NEXT READ IN THE LENGTH OF THE FIRST LIST AND THEN  */
/* READ IN THAT LIST INTO THE ARRAY A. THIS IS FOLLOWED BY THE       */
/* NUMBER OF LETTERS IN THE NEW CODING SCHEME, WHICH IN TURN IS      */
/* FOLLOWED BY THE LIST FOR THE NEW CODINGS. THIS IS REPEATED FOR    */
/* AS MANY AS INDICATED IN NZ.                                       */

91  GEI  EDIT  (87J  (COL  ( 1 ) , E (.')  | ; 

9?  DC  1*1  TO  87; 

41  Cc*.  "PIT  (N,  (A  (I.J)  DO  J«1  TO  N)  ,NA,  (U7(C,  I)  CO  J*1  TO  NA) ) 

(X  ( 1)  »F  ( 1) , Y (1) , IN)  A(1),X(t),I(1),E(1),tNA)  All)); 

94  f(T)«N; 

95  8f  (I)  • NA  ; 

96  END; 

97  DC  t F I P*3  TO  1; 

98  JJal; 

99  CC  L»1  TO  87; 

IOC  PC  1I«1  TO  87 ; 

101  KS*  1; 

102  If  l>«  tL  THEN  GO  TO  SEN; 

/* CALL THE SUBROUTINE ALIAS TO COMPUTE THE TWO AND THREE WAY        */
/* INTERACTION TERMS.                                                */

104  CALL  ALIAS  (8(L)  ,H(LL)  , H,  N F,  A, !.,  II  ,87 , P,  KA,  KB  .TRIP)  ; 

105  KC*KR* 1 ; 

106  PC  1-1  TC  1C; 

/•  STORE  Til"  FINDINGS  IN  Til".  X AT  F IX  F , CONCAT ‘NAT  INC  EACH  LETTER  10  •/ 
/»  E0F8.  A NICE  STRING  Cl  CHARATLRS  TO  HE  OUTPUTTEC.  •/ 

/•  THIS  IS  EOF  111 X ORIGINAL  Of  QIC  CODING  SCHEHE  */ 

107  , F|JJ)«*  •; 

108  IE  NF  ( I)  • 1 THEN  R ( J J)  • P ( I , 1)  ; 

110  ELSE  If  Nr<D*2  THEN  F ( JJ ) -T  (1 , 1 ) It  P (1 , 2) ; 

112  ELSE  If  NP(I)*J  THIN  F (JJ)  -P  { 1 , 1)  1 1 P (1 , 2)  | | ? (1 , 1) ; 

114  FLSE  IF  NF(I)“4  Tll"N  F ( JC)  ‘ P (1 . 1)  | | P (T  , 2)  | ) P 1 1,  j)  1 1 P (I,  4)  ; 

116  ELSE  IF  Nf(  I)  *5  THEN  S(J.l)  -Pit,  1)  1 1 C (I  ,2)  ; IP;I,3))|P(I,4)||P{I,5)  ; 

118  ELSE  IF  NF  (I)  *6  THEN  0 (OJ)  *F  (I,  1)  | | p (I,  2)  It  T <1,  3)  1 1 P (I,  4)  | | P (1,5)  1 1 

P(1.6)  ; 
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120  JJ.JJPl; 

121  IND; 

122  StwsENC; 

121  !A0; 

12“  rk> i ; 

125  cc  1*1  TO  fiz ; 30  J*1  TO  NT;  IF  I>*J  THEN  CC  TO  JIN; 

/*  THf  SECOND  ORDER  INTERACTION  TERNS  FOR  TUt  HE <i  CODING  ARE  CONFUTED*/ 
121  If  1PIPO  TII’N  DO; 

121  IF  N <t)*2CHN(.l»*2  THEN  0 (KK)  » DA  ( 1 , I)|  | 81  (2, 1)1  I UT  1 1 ,J)  I t UZ  ( 2,  J)  ; 

121  ELSE  If  Nil  IT)  •ICFR(J)  *2  THEN  Q ( K K)  * BZ  ( I , I)  I I EZ  < 1 , J)  | | BZ  (2,  J)  ; 

125  elJE  IF  NN  (J)  *26Nf  (J)  *1  THEN  J(KK'*BZ(1,I)||8Z(J,1)||BZ(1,J); 

122  1 1 £S  Q(XK)  «EZ(1,I)  | |8Z(1,J|  ; 

111  NMXM1; 

HI  GC  TC  JIN; 

'“0  TKD; 

/•  IKS  THIRD  ORDER  INTERACTION  TERNS  FOB  NtH  CODING  SCHENK  ARE  */ 

/•  CF SATED  ,y 

INI  DC  K«  1 10  Nt;  ' 

1 42  IF  K<aI | K<«J  THEN  GO  TO  KIN; 

1“4  ir  NN  (I)  «2  THIN  30; 

i “6  rfozn.n  nnz(i,i) ; 

1 SR  INC; 

155  US’  TO; 

15"  *tH  iO|.26Hn(K»  ’•2  THEN  U (KK)  *BZ  ( 1, 1)  | | BZ  (1  ,J)  | | 82  (2,J)  | |B2  ( 1 ,10  || 

" “ i2 # K)  ; 

158  ‘“iaVW’  C THEN  0<NKJ*BZ(1,IJ||DZ|1,J)||BZ(1,KJ  II 

160  FtSS  IF  NN(J)*2  G NN(K)«*.  THEN  0 (AN)  »UZ  (1 , 1 )|  | BZ  ( 1 ,J)  1 1 BZ  (2,  J)  1 1 

bz  (i»  f| ; 

162  USE  C(KK)  "DC  (1 , 1)  | )UZ(1,J)  I | DA  ( 1 , K)  ; 

163  INC;  KK.KK.1;  K I F : l ; JINS  SK3;  IRC; 


ig*NZMNZ-i)/2; 

IF  TFIP-*«0  THIN  U*NZ*  INZ-1)*  (HZ*2)/G; 

/* SORT THE ALIASES FOR THE OLD CODING SCHEME INTO ASCENDING ORDER   */
/* THIS IS DONE ONLY IF THE NUMBER OF ALIASES ARE LESS THAN 1000.    */
/* WITH MORE THAN 1000 THE COMPUTER TIME IS TOO COSTLY. FOR THE      */
/* SITUATION WHERE THE NUMBER OF ALIASES ARE GREATER THAN 1000, THEY */
/* ARE OUTPUTTED TO AN EXTERNAL FILE ON DISK OR TAPE. USING IBM SORT */
/* ROUTINE, WHICH IS MUCH FASTER, THE ALIASES ARE SORTED FOR THE OLD */
/* CODING SCHEME. THEN IN ANOTHER SHORT PROGRAM, CONTIN, THE SORTED  */
/* ALIASES ARE READ BACK INTO THE COMPUTER AND OUTPUTTED IN NICE FORM*/
/* ALONG WITH THE NEW CODING SCHEME.                                 */

/* IF THE NUMBER OF ALIASES ARE LESS THAN 1000, THE PROCESS OF SORTIN*/
/*G AND OUTPUTTING ARE AUTOMATIC.                                    */

TF  iy>  1030  Til’S  GO  10  LCCP ; 

DC  .1*1  TO  IQ; 

ICC  ( J)  *J ; INC; 

Sk*  1 ; CO  WHILE  (SW*»3);  SN«3; 

CC  J* 2 10  IQ;  t*LOC(J);  Ll«tCC(0-1): 

IF  R(l)<P(LL)  THEN  DO;  IT:*LCC|J|;  ICC  (J) -IOC  (J-1)  ; 

LCC  ( J*  1)  * ITT ; S U « 1 ; LHD;  END;  INO; 

FC  .1*1  TO  LQ;  KCC(J)*tNDSX(P(J),»  »> -1 ; END; 

S V*  1 ; CO  WHILE  (SU-«0)  ; SN*T; 

CO  J « 2 TC  LQ;  l*LCC(G);  ll*LOC(J-1); 

IF  >OC(I)<SCC(Ll)  THEN  CO;  ITT*lOC|J); 

LCC  (.1)  * LCC  (J  - 1)  ; LCC  (J -1)  *ITT;  SW*1; 

INC;  END;  END; 

tut  ust  c •)  ; rui  skip(w)  ; 

IF  TR IP*C  THEN  GC  TC  NARS;  J R*  1 ; 

PCT  UIT(R  (LCC  (I))  ,Q(ICC(!)))  (CGt  ( I)  . A (6)  , Z (2)  , A (6)1  : 

CC  1*2  TC  Li,;  1N*LM1;  ;p  R (ICC  ( t)J -.*R  (LCC  II  - 1 ) ) THEN  DC- 
U*1:  PUT  ECIKRtLOCd)))  (SNlP(3).COL(i|,A(6h;  ESC; 

” ilt>'fflTirXy^0'*vn  ^IT(g(LOC(in,  <CCl(11),A(o),; 

l R*  1 ; <*o  ic  y;  end; 
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228  pot  ron  to  (toe  ti  j ) I J> . M«) ) ; 

224  TtlNO; 

220  POt  SKIP  (3)  j 
231  GC  TC  CQj 
212  COfRl 

put  ED  IT  ( * THP  DATA  tO  8*  SORTED  HAS  TOO  8UCB.  tt  IS  IttHG  RUTTER 
TO  AH  EXTKSNAl  It U •)  (SK IP (4)  ,COt ( 1)  , k)  J 
211  rot  POIT  (•  TO  BECOXHF  SORTED  DATA,  USE  IBR  SORT  lOUTINE  AHD  TRt*  TH* 

SHC  AT  t FOR  PAH  CCHTIH  TO  OUTPUT  PINAL  ALIAS')  tSKIR  (1)  ,COfc(1)  ,A)  i 

234  00  J*t  TO  LC; 

235  POT  rm(GGCG)  EDIT  (P(J)  .HOC  (J)  ,0  (J) ) (COL  (1) , A (6 ) , X (2) , P {«) , X ( 2)  , A (4) ) ; 

236  £ NO } 

237  GC  to  COi  HAKS: 

234  NSf  • NX/2 | NPA*U  LKK»1;  LLL*RIP»NSP;  RP»*3; 

243  DO  KAMI  TO  IQ  BX  ILL; 

244  FUt  SKIP{4)  ; PUT  lOIt(l(LOC(HPA)))  (COM3), A(6)); 

246  NrA*H!P*NPA; 

247  00  I* HP A TO  LC  FT  NSP; 

2«3  IKR*LKK*1;  IP  UK>K1F  THEN  CO; 

25i  *n«:;  IRK*  1;  00  TO  SEA; 

254  IHO; 

255  PUT  ECIT{8  (ICC  (I)  | ) (X  12} « A (6) ) ; 

256  tNOj 

257  SIA:  NX«KAR*N3t-1;  PUT  SKIP  (3); 

254  DO  J«K AK  TO  NX; 

2fr0  K tt- 1 i PUT  l DIT(U  (LOC  (J)))  (COL  (3),  A (6));  * 

262  DC  1*3* NSP  TO  LO  El  NSP;  ! 

2hJ  Klt«KU*1;  IP  KLl>KIP  THIN  DO;  XLL*  1;  GO  TO  ICC;  j 

268  end; 

264  PUT  1CIT(C(L0C(:>))  (X  (2)  ,A(6)J  ; 

270  2ND;  IOC ; K AC ; INC; 

273  0C  :ENC; 

274  END  ALPO; 
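
The comments in the listing describe the core step: compare the letters of two labels, drop the letters that match, and keep the sorted remainder. A minimal Python sketch of that idea follows. It is not a transcription of the PL/I program. The function names are hypothetical, and the mapping from new factor labels to original factorial labels is an assumption read column by column from the Appendix I-A table for the 16-run design.

```python
from collections import defaultdict
from itertools import combinations


def product(*labels):
    """Generalized interaction of several labels: keep the letters that
    occur an odd number of times, sorted alphabetically."""
    letters = set()
    for label in labels:
        letters ^= set(label)  # symmetric difference drops matched letters
    return "".join(sorted(letters))


def alias_strings(new_to_old, order):
    """Group all interactions of `order` new factors by the original
    factorial label they reduce to."""
    strings = defaultdict(list)
    for combo in combinations(sorted(new_to_old), order):
        old_label = product(*(new_to_old[f] for f in combo))
        strings[old_label].append("".join(combo))
    return dict(strings)


# Assumed pairing of new main effects with original labels (Appendix I-A).
NEW_TO_OLD = {"A": "ABCD", "B": "ABC", "C": "ABD", "D": "ACD",
              "E": "AB", "F": "AC", "G": "AD", "H": "A"}

two = alias_strings(NEW_TO_OLD, 2)    # two-factor interaction strings
three = alias_strings(NEW_TO_OLD, 3)  # three-factor interaction strings

print(two["D"])     # new two-factor interactions sharing original label D
print(three["AD"])  # three-factor interactions aliased with new factor G
```

Under these assumptions the 28 two-factor interactions fall into 7 strings of 4, and the 56 three-factor interactions fall into 8 strings of 7, one per main effect, which is the layout of the Appendix I tables.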


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i S S 8 8 1 

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^1 


n 
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APPENDIX  V 


PROBABILITY VALUES FOR CONSTRUCTING HALF-NORMAL GRIDS

The table below provides the probability values at which
the first four largest effects would be plotted on grids for
grids with from 63 to 4 ranks. For each set the number in
parentheses is the rank, R, where R = 0.683 Y + 0.5
(Y = largest rank; also N-1), representing the estimated
standard deviation for a Y-size grid (Daniel, 1959, p 322).
The relationship between P, the probability value on normal
probability paper, and P', the new probability values for the
half-normal grid, is explained on page 85 in the text. Grids
can be properly spaced by relating the original P values to
their corresponding Z-values (where σ = 1) found in most
normal distribution tables.
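
The tabled values can be regenerated with a few lines of Python. This is a sketch under stated assumptions: the formulas P' = 100 (rank - 0.5) / Y and P = (100 + P') / 2 are inferred from the tabled numbers, not quoted from page 85 of the text, and R is taken as 0.683 Y + 0.5 rounded to the nearest whole rank, which matches the parenthesized values for the grids checked. The function names are hypothetical.

```python
import math


def half_normal_grid(Y, n_largest=4):
    """Return (rank, P', P) for the n_largest ranks of a Y-rank grid.

    Assumed relations (inferred from the table, not from the text):
      P' = 100 * (rank - 0.5) / Y      half-normal plotting position
      P  = (100 + P') / 2              matching normal-paper probability
    """
    rows = []
    for rank in range(Y, max(Y - n_largest, 0), -1):
        p_prime = 100.0 * (rank - 0.5) / Y
        p = (100.0 + p_prime) / 2.0
        rows.append((rank, round(p_prime, 2), round(p, 2)))
    return rows


def sd_rank(Y):
    """Rank R at which the estimated standard deviation is read,
    R = 0.683 Y + 0.5 rounded to the nearest whole rank (assumption)."""
    return math.floor(0.683 * Y + 0.5 + 0.5)


for rank, p_prime, p in half_normal_grid(63):
    print(rank, p_prime, p)
print("R =", sd_rank(63))
```

To space a grid, each P would then be converted to its standard normal Z-value with a normal distribution table, as described above.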

Y is the largest rank of the grid. The four column pairs give
P' and P for ranks Y, Y-1, Y-2, and Y-3. The number in
parentheses is R.

  Y    (R)     Rank Y         Rank Y-1        Rank Y-2        Rank Y-3
              P'     P        P'     P        P'     P        P'     P

 63   (44)   99.21  99.60   97.62  98.81   96.03  98.02   94.44  97.22
 62   (43)   99.19  99.60   97.58  98.79   95.97  97.98   94.35  97.18
 61   (42)   99.18  99.59   97.54  98.77   95.90  97.95   94.26  97.13
 60   (41)   99.17  99.58   97.50  98.75   95.83  97.92   94.17  97.08

 59   (41)   99.15  99.58   97.46  98.73   95.76  97.88   94.07  97.03
 58   (40)   99.14  99.57   97.41  98.71   95.69  97.84   93.97  96.98
 57   (39)   99.12  99.56   97.37  98.68   95.61  97.81   93.86  96.93
 56   (39)   99.11  99.55   97.32  98.66   95.54  97.77   93.75  96.88

 55   (38)   99.09  99.55   97.27  98.64   95.45  97.73   93.64  96.82
 54   (37)   99.07  99.54   97.22  98.61   95.37  97.69   93.52  96.76
 53   (37)   99.06  99.53   97.17  98.58   95.28  97.64   93.40  96.70
 52   (36)   99.04  99.52   97.12  98.56   95.19  97.60   93.27  96.63

 51   (35)   99.02  99.51   97.06  98.53   95.10  97.55   93.14  96.57
 50   (35)   99.00  99.50   97.00  98.50   95.00  97.50   93.00  96.50
 49   (34)   98.98  99.49   96.94  98.47   94.90  97.45   92.86  96.43
 48   (33)   98.96  99.48   96.88  98.44   94.79  97.40   92.71  96.35

 47   (33)   98.94  99.47   96.81  98.40   94.68  97.34   92.55  96.28
 46   (32)   98.91  99.46   96.74  98.37   94.56  97.28   92.39  96.20
 45   (31)   98.89  99.44   96.67  98.33   94.44  97.22   92.22  96.11
 44   (31)   98.86  99.43   96.59  98.30   94.32  97.16   92.05  96.02

 43   (30)   98.84  99.42   96.51  98.26   94.19  97.09   91.86  95.93
 42   (29)   98.81  99.40   96.43  98.21   94.05  97.02   91.67  95.83
 41   (29)   98.78  99.39   96.34  98.17   93.90  96.95   91.46  95.73
 40   (28)   98.75  99.38   96.25  98.12   93.75  96.88   91.25  95.62
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PROBABILITY  VALUES  FOR  CONSTRUCTING  HALF-NORMAL  GRIDS  (Continued) 


Y is the largest rank of the grid. The four column pairs give
P' and P for ranks Y, Y-1, Y-2, and Y-3.

  Y      Rank Y         Rank Y-1        Rank Y-2        Rank Y-3
        P'     P        P'     P        P'     P        P'     P

 39    98.72  99.36   96.15  98.08   93.59  96.79   91.03  95.51
 38    98.68  99.34   96.05  98.03   93.42  96.71   90.79  95.39
 37    98.65  99.32   95.95  97.97   93.24  96.62   90.54  95.27
 36    98.61  99.31   95.83  97.92   93.06  96.53   90.28  95.14
 35    98.57  99.29   95.71  97.86   92.86  96.43   90.00  95.00
 34    98.53  99.26   95.59  97.79   92.65  96.32   89.71  94.85
 33    98.48  99.24   95.45  97.73   92.42  96.21   89.39  94.70
 32    98.44  99.22   95.31  97.66   92.19  96.09   89.06  94.53
 31    98.39  99.19   95.16  97.58   91.94  95.97   88.71  94.35
 30    98.33  99.17   95.00  97.50   91.67  95.83   88.33  94.17
 29    98.28  99.14   94.83  97.41   91.38  95.69   87.93  93.97
 28    98.21  99.11   94.64  97.32   91.07  95.54   87.50  93.75
 27    98.15  99.07   94.44  97.22   90.74  95.37   87.04  93.52
 26    98.08  99.04   94.23  97.12   90.38  95.19   86.54  93.27
 25    98.00  99.00   94.00  97.00   90.00  95.00   86.00  93.00
 24    97.92  98.96   93.75  96.88   89.58  94.79   85.42  92.71
 23    97.83  98.91   93.48  96.74   89.13  94.57   84.78  92.39
 22    97.73  98.86   93.18  96.59   88.64  94.32   84.09  92.05
 21    97.62  98.81   92.86  96.43   88.10  94.05   83.33  91.67
 20    97.50  98.75   92.50  96.25   87.50  93.75   82.50  91.25
 19    97.37  98.68   92.11  96.05   86.84  93.42   81.58  90.79
 18    97.22  98.61   91.67  95.83   86.11  93.06   80.56  90.28
 17    97.06  98.53   91.18  95.59   85.29  92.65   79.41  89.71
 16    96.88  98.44   90.62  95.31   84.38  92.19   78.12  89.06
 15    96.67  98.33   90.00  95.00   83.33  91.67   76.67  88.33
 14    96.43  98.21   89.29  94.64   82.14  91.07   75.00  87.50
 13    96.15  98.08   88.46  94.23   80.77  90.38   73.08  86.54
 12    95.83  97.92   87.50  93.75   79.17  89.58   70.83  85.42
 11    95.45  97.73   86.36  93.18   77.27  88.64   68.18  84.09
 10    95.00  97.50   85.00  92.50   75.00  87.50   65.00  82.50
  9    94.44  97.22   83.33  91.67   72.22  86.11   61.11  80.56
  8    93.75  96.88   81.25  90.62   68.75  84.38   56.25  78.12
  7    92.86  96.43   78.57  89.29   64.29  82.14   50.00  75.00
  6    91.67  95.83   75.00  87.50   58.33  79.17   41.67  70.83
  5    90.00  95.00   70.00  85.00   50.00  75.00   30.00  65.00
  4    87.50  93.75   62.50  81.25   37.50  68.75   12.50  56.25
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APPENDIX  VI 

DERIVATION  OF  COMBINED  LINEAR  AND  CUBIC 
TREND-ADJUSTMENT  EQUATIONS 

Dr.  Steve  R.  Webb 

The following derivation parallels the ones used to
obtain the linear and quadratic trend-correction equations
described by Daniel and Wilcoxon (1966, pp 272-273).


1.  Normal  equations  for  ordered  2-  plans  to  correct  for 
linear  (L)  and  cubic  (K)  trends. 

\[ l\dot{L} + x\dot{X} + y\dot{Y} + z\dot{Z} + \cdots = (L) \tag{1.1} \]

\[ k\dot{K} + x'\dot{X} + y'\dot{Y} + z'\dot{Z} + \cdots = (K) \tag{1.2} \]

\[ x'\dot{K} + x\dot{L} + N\dot{X} = (X) \tag{1.3} \]

\[ y'\dot{K} + y\dot{L} + N\dot{Y} = (Y) \tag{1.4} \]

\[ z'\dot{K} + z\dot{L} + N\dot{Z} = (Z) \tag{1.5} \]

etc.

where l = [LL], x = [LX], y = [LY], z = [LZ],
k = [KK], x' = [KX], y' = [KY], z' = [KZ].

The meaning of the alternate symbols can be found in Table 15 in
the text. N = 2^p and (X), (Y), (Z) are the contrasts correlated
with (L) and (K). A dot over a letter indicates it is an unknown
term.

2.  From equations (1.3), (1.4) and (1.5) we can obtain

\[ N\dot{X} = (X) - x\dot{L} - x'\dot{K} \tag{1.6} \]

\[ N\dot{Y} = (Y) - y\dot{L} - y'\dot{K} \tag{1.7} \]

\[ N\dot{Z} = (Z) - z\dot{L} - z'\dot{K} \tag{1.8} \]

etc.

Multiplying (1.1) and (1.2) through by N and substituting
these equations, we obtain:

\[ (Nl - x^2 - y^2 - z^2 - \cdots)\dot{L} + (-xx' - yy' - \cdots)\dot{K} = N(L) - x(X) - y(Y) - \cdots \tag{1.9} \]

\[ (-x'x - y'y - \cdots)\dot{L} + (Nk - {x'}^2 - {y'}^2 - {z'}^2 - \cdots)\dot{K} = N(K) - x'(X) - y'(Y) - \cdots \tag{1.10} \]

With the solutions for L and K in terms of the observations
and the design parameters, we can evaluate the regression
coefficients directly from equations (1.6) to (1.8).

Equations 1.9 and 1.10 are written using the alternate
symbols in Table 15, Equations IIIa and b, in the text.
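
The two-step solution can be sketched numerically in Python: solve the pair of equations (1.9) and (1.10) for the trend terms, then back-substitute into (1.6) to (1.8). The function name is hypothetical, and every number in the example is made up purely for illustration; none of it comes from the report.

```python
def solve_trend_adjustment(N, l, k, x, xp, cX, cL, cK):
    """Solve (1.9)-(1.10) for L and K, then (1.6)-(1.8) for the effects.

    x[i]  = [L X_i] and xp[i] = [K X_i] are the inner products of the
    i-th effect column with the linear and cubic trend columns;
    cX[i], cL, cK are the observed contrasts (X_i), (L), (K).
    """
    a11 = N * l - sum(v * v for v in x)
    a12 = -sum(v * w for v, w in zip(x, xp))
    a22 = N * k - sum(w * w for w in xp)
    b1 = N * cL - sum(v * c for v, c in zip(x, cX))
    b2 = N * cK - sum(w * c for w, c in zip(xp, cX))
    det = a11 * a22 - a12 * a12          # 2 x 2 system, Cramer's rule
    L = (b1 * a22 - a12 * b2) / det
    K = (a11 * b2 - a12 * b1) / det
    effects = [(c - v * L - w * K) / N for c, v, w in zip(cX, x, xp)]
    return L, K, effects


# Illustrative inputs only (hypothetical values).
N, l, k = 8, 168.0, 264.0
x = [16.0, 8.0, 4.0]          # [LX], [LY], [LZ]
xp = [-12.0, 6.0, 3.0]        # [KX], [KY], [KZ]
cX = [20.0, -6.0, 10.0]       # contrasts (X), (Y), (Z)
cL, cK = 50.0, -14.0          # contrasts (L), (K)

L, K, effects = solve_trend_adjustment(N, l, k, x, xp, cX, cL, cK)

# Residuals of the original normal equations (1.1) and (1.2).
res_L = l * L + sum(v * e for v, e in zip(x, effects)) - cL
res_K = k * K + sum(w * e for w, e in zip(xp, effects)) - cK
print(L, K, effects, res_L, res_K)
```

Checking that both residuals are zero to rounding error is a quick way to confirm that the reduced equations and the back-substitution agree with the full set of normal equations.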
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APPENDIX  VII 

CALCULATING  DETERMINANTS 


Determinants for a 2 x 2 matrix as shown in this
illustration are easy to calculate. For example, if the
elements of the matrix were:

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

then the determinant of the matrix (indicated by the
vertical lines) is:

\[ D = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]

where a and d are sums of squares and b and c are sums of
products in our application.

If there are three responses, then the matrices become
larger to include the additional sums of products (e.g.,
between responses 1 and 2, 2 and 3, and 1 and 3). Thus, for
three responses, the total matrix, by way of illustration,
would be:

\[ T = \begin{pmatrix} SS_{t1} & SP_{t12} & SP_{t13} \\ SP_{t12} & SS_{t2} & SP_{t23} \\ SP_{t13} & SP_{t23} & SS_{t3} \end{pmatrix} \]

a symmetrical matrix with the sum of squares for each
response, 1, 2, and 3, on the diagonal, and the sums of
products in the appropriate columns and rows off the
diagonal. The determinant of a 3 x 3 matrix is:

\[ D = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = aei + bfg + dhc - gec - dbi - ahf \]

A computer would be used to calculate determinants for
larger matrices.
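
The two expansions translate directly into Python. This is a minimal sketch with hypothetical function names; the example matrices are made-up symmetric matrices of the sums-of-squares and sums-of-products form, not data from the report.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix [[a, b], [c, d]]: ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c


def det3(m):
    """Determinant of a 3 x 3 matrix by the six-term expansion
    aei + bfg + dhc - gec - dbi - ahf."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * e * i + b * f * g + d * h * c - g * e * c - d * b * i - a * h * f


# Illustrative symmetric matrices: sums of squares on the diagonal,
# sums of products off the diagonal (hypothetical values).
two_responses = [[3.0, 1.0],
                 [1.0, 2.0]]
three_responses = [[4.0, 2.0, 1.0],
                   [2.0, 5.0, 3.0],
                   [1.0, 3.0, 6.0]]

print(det2(two_responses))
print(det3(three_responses))
```

For more than three responses a general routine (for example, elimination to triangular form) would replace the explicit expansions.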
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APPENDIX  VIII 

ZAHN'S  GUARDRAILS  FOR  HALF-NORMAL  PLOTS 

Zahn (1975a) provides critical values for plotting
guardrails for PER = α = 0.05, 0.20, and 0.40 on the
half-normal grids.

For version S, he provides them only for N = 15,
assuming four real effects. This could be used if the results
from a 2^(8-4), Resolution IV, screening design were plotted.
The critical values, taken from Zahn's (1975a, p 197) Table 5,
are:

  N = 15             α
    R        0.05    0.20    0.40

   15        3.37    2.61    2.20
   14        3.00    2.34    1.97
   13        2.61    2.06    1.76
   12        2.21    1.76    1.51

Unlike Daniel's, Zahn's guardrails will appear curved, as
shown in this reproduction from his Figure 9 (p 198):

[Figure not reproduced in this copy. Labels recoverable from
the scan: Estimated Standard Deviation; RANK.]
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Appendix  VIII  (Continued) 


For version X, Zahn (1975a, p 195) provides the critical
values for N = 15, 31, 63, and 127. Taken from his Table 7,
the critical values for N = 15, 31, and 63 are:

                     α
    R        0.05    0.20    0.40

  N = 15
   15        3.230   2.470   2.066
   14        2.840   2.177   1.827
   13        2.427   1.866   1.574
   12        2.065   1.533   1.298

  N = 31
   31        3.351   2.730   2.372
   30        3.173   2.586   2.247
   29        2.992   2.439   2.121
   28        2.807   2.288   1.891
   27        2.615   2.133   1.857

  N = 63
   63        3.470   2.945   2.629
   62        3.384   2.872   2.564
   61        3.297   2.797   2.497
   60        3.209   2.722   2.431
   59        3.120   2.647   2.363
   58        3.030   2.570   2.295
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